
Discussion: audio host bandwidth troubles and thoughts about a solution for audio host offloading [technical]
Posted by u/cuddle_with_me

[This happens to have coincided with a Soundgasm issue. I have been pondering for a while what to do and how someone could help out as a third party, which is why I posted this. The idea was to move the discussion from "a certain site is having issues now and then" to "this is how bandwidth is a problem", "so this could affect other sites too" and "there might be something we could do on a technical level". I did not post it now to "capitalize" on the current issue; that would strike me as a bit like fiddling while Rome is burning, and it was not my intent. Personally, I highly endorse the solution of the community pitching in to help with Soundgasm's running costs. But there are still aspects worth exploring.]


There was recently a thread about Soundgasm by u/skitty-gwa, including the proposal to launch a new audio host, Sensoral. The creator of Soundgasm posted a comment, more or less confirming something I've been wondering about, which is that Soundgasm is running up against bandwidth limitations.

If you serve audio files at GWA scale, bandwidth (both costs and limitations) is what kills you - that is, the traffic taken up by transmitting the audio file to each listener. Either you get capped, or you get invoiced to hell and back. At Soundgasm quality, 1 minute of audio comes to about 1 MB with the current encoding. There are ~745,000 subscribers to GWA currently, and about five audios posted per hour. Let's assume the average audio is 7 minutes long, so about 7 MB. If 2.5% (one out of 40) of GWA's subscribers listen to only one (whole) audio per day, that's 18,625 listens, and ~130 GB per day. For a 30 day month, that's 3.9 TB. At rates representative of most hosts that charge for bandwidth, this costs you at least $300 per month. This is a (very) low-ball estimate for a site in Soundgasm's position, but it is within reach surprisingly quickly for any audio host that becomes popular. Paying for the server/virtual machine/hosting itself is, comparatively, peanuts.
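The arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the estimate: the function name and parameters are made up for illustration, decimal units are used (1 GB = 1000 MB), and the rate of $0.08 per GB is an assumption picked to land near the "at least $300" figure, not a quoted price from any host.

```python
def estimate_bandwidth(subscribers, listen_fraction, minutes_per_audio,
                       mb_per_minute=1.0, days=30, usd_per_gb=0.08):
    """Return (listens per day, GB per day, TB per month, USD per month).

    usd_per_gb is an assumed illustrative rate, not a real price list entry.
    """
    listens = subscribers * listen_fraction
    # 1 minute ~ 1 MB at the encoding described in the post; decimal units.
    gb_per_day = listens * minutes_per_audio * mb_per_minute / 1000
    gb_per_month = gb_per_day * days
    return listens, gb_per_day, gb_per_month / 1000, gb_per_month * usd_per_gb


listens, gb_day, tb_month, usd_month = estimate_bandwidth(745_000, 0.025, 7)
print(f"{listens:,.0f} listens/day, {gb_day:.0f} GB/day, "
      f"{tb_month:.1f} TB/month, about ${usd_month:.0f}/month")
```

Changing `listen_fraction` or `minutes_per_audio` shows how quickly the monthly total grows for a popular host.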

The best way around this is to find a server host that has so-called unmetered bandwidth, meaning they don't charge for the bandwidth you use. I am aware of very few. Because this isn't advertising I won't name them here, but I will help if contacted via PM. That isn't the point of this post either, though.

(Soundgasm as currently hosted does have unmetered bandwidth, as far as I can tell, but has filled its current bandwidth limits, and does not have another server which also has unmetered bandwidth, with which to share the load.)

The point is: how do the people running an audio host who are not with such a server host get some relief? Switching hosts is a tremendous investment in time, planning and trust. There are more audio hosts out there now than two years ago, as part of the community becoming more diverse. But the risk is that some of them will be running into similar issues at some point in time.

I have spent a lot of time thinking about this, and have come up with a way that could help out, and that could, with some thought put into it, scale reasonably well.


Basically, the main idea is to offload some of the serving of the actual audio files. For any given host, most of the traffic is probably taken up by the currently popular audio files, be they evergreens or today's hot files. So, let's say that every now and then, a background job on the audio host tallies a few of those most-requested audios, and then sends a signal to an offload server. The offload server receives the list, downloads these files and sends a signal in response, saying "I downloaded these files and until time X, if you want, you can send people to these corresponding addresses on my server instead". For each of those audios, the audio host notes this information, and when someone goes to listen to one of them, it rolls a die and sends some fraction of people to the offload server instead of to the audio file on its own server.

(This is similar to a custom-designed "CDN", Content Delivery Network, that some people may have heard about.)
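The host-side part of this flow (tally the hot audios, remember what the offload server promised, roll the die per listener) could look roughly like the following. This is a minimal sketch, not the draft specification: `OffloadLease`, `most_requested`, `pick_url` and the example URLs are all hypothetical names, and the signalling between the two servers is left out.

```python
import random
import time
from collections import Counter
from dataclasses import dataclass


@dataclass
class OffloadLease:
    """What the offload server promised for one audio (hypothetical shape)."""
    offload_url: str
    valid_until: float  # Unix timestamp, the "time X" from the reply


def most_requested(request_counts: Counter, n: int) -> list[str]:
    """Tally step: the n audio IDs with the most recent requests."""
    return [audio_id for audio_id, _ in request_counts.most_common(n)]


def pick_url(original_url, lease, offload_fraction, now=None, rng=random):
    """Roll the die: send roughly offload_fraction of listeners elsewhere."""
    now = time.time() if now is None else now
    if lease is None or now >= lease.valid_until:
        return original_url  # no lease, or lease expired: serve it ourselves
    if rng.random() < offload_fraction:
        return lease.offload_url
    return original_url


counts = Counter({"a1": 900, "a2": 40, "a3": 350})
hot = most_requested(counts, 2)  # the list that would be sent to the offload server
lease = OffloadLease("https://offload.example/a1.m4a",
                     valid_until=time.time() + 3600)
url = pick_url("https://host.example/a1.m4a", lease, offload_fraction=0.5)
```

Keeping `offload_fraction` as a plain number means the audio host can dial the share up or down, or set it to zero, without involving the offload server.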

This idea fulfills these goals:

  • Reasonably privacy-preserving. Only audios that are already popular are sent over, and the fact that an audio at the top of GWA is available to listen to is as public as it can get. Sending over all files willy-nilly, by contrast, would expose a lot of private information. If an audio is deleted, the audio host can send another signal saying hey, this is now removed, please remove the file and stop serving it.

  • Works for public, freely-available audios. Note that this is a really poor idea for non-public or for-pay audios requiring authentication at the audio host. (That said, let's say one of the audios was an unlisted Patreon-ish reward that had a publicly accessible link; having it hosted by the offload server would be about as reliant on people not finding out the link as the original, and wouldn't be worse in that situation, nor would it be better.)

  • Driven by the audio host. It can choose which audios to point out, it can choose whether to serve its own audio file or the offloaded address, and if it does analytics and so on upon playing, it can do those and then send a redirect to the offloaded address in place of beginning to stream the file from itself.

  • Short-term. The offload server is not supposed to hold onto everything forever, only a few things for a short time period. This increases the odds that the offload server is not itself overloaded.

  • Degrades gracefully. If someone tries to use an offloaded link afterwards, it could redirect to the original listening address at the audio host.

  • Easy to switch off. All the audio host needs to do is to no longer send people to the offloaded address. It could run a check every 30 seconds that tries downloading one of the known offloaded audios, and if that doesn't work, don't use offloading (but if it springs back to life again, start doing it again).
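On the offload server's side, the short-term, removal and graceful-degradation goals above amount to a small piece of bookkeeping. The following is one possible sketch under stated assumptions: `OffloadStore` and its method names are hypothetical, everything is kept in memory, and actually downloading, deleting and streaming files is not shown.

```python
import time
from dataclasses import dataclass


@dataclass
class StoredAudio:
    original_url: str  # listening address at the audio host
    local_path: str    # where the downloaded copy lives on the offload server
    expires_at: float  # Unix timestamp after which the copy is not served


class OffloadStore:
    """In-memory bookkeeping for an offload server (hypothetical sketch)."""

    def __init__(self):
        self._active = {}   # audio_id -> StoredAudio
        self._origins = {}  # audio_id -> original_url, kept for late visitors

    def register(self, audio_id, original_url, local_path, ttl_seconds, now=None):
        """Record a downloaded audio; returns the "until time X" timestamp."""
        now = time.time() if now is None else now
        expires_at = now + ttl_seconds
        self._active[audio_id] = StoredAudio(original_url, local_path, expires_at)
        self._origins[audio_id] = original_url
        return expires_at

    def remove(self, audio_id):
        """The host reported a deletion: forget the audio entirely."""
        self._active.pop(audio_id, None)
        self._origins.pop(audio_id, None)

    def resolve(self, audio_id, now=None):
        """Return ("serve", path), ("redirect", original_url) or ("gone", None)."""
        now = time.time() if now is None else now
        entry = self._active.get(audio_id)
        if entry is not None and now < entry.expires_at:
            return ("serve", entry.local_path)
        if entry is not None:
            del self._active[audio_id]  # expired: stop serving the copy
        origin = self._origins.get(audio_id)
        return ("redirect", origin) if origin else ("gone", None)
```

A request handler would map `"serve"` to streaming the file, `"redirect"` to an HTTP redirect back to the audio host, and `"gone"` to a not-found response. Note that `remove` also drops the fallback address, so a deleted audio leaves nothing behind.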

This idea is custom in design, and will require code-level implementation work for the audio host. However, it will be reasonably clear and slide on top of the audio host's other concerns. And if it has been implemented once, it is also reasonably simple to factor it to allow multiple offload servers, which gives a way to scale if just one is not enough. Similarly, one offload server could easily serve multiple audio hosts.
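To illustrate how the periodic check and the multiple-offload-server idea might fit together on the audio host, here is a rough sketch. `OffloadPool`, the probe callables and the server names are assumptions made for this example; a real probe would try to download a known offloaded audio, which is stubbed out here.

```python
import random


class OffloadPool:
    """Host-side view of several offload servers (hypothetical sketch)."""

    def __init__(self, probes):
        # probes: server name -> zero-argument callable returning True if a
        # known offloaded audio could be downloaded from that server.
        self._probes = dict(probes)
        # Nothing is trusted until the first check has succeeded.
        self._healthy = {name: False for name in self._probes}

    def run_checks(self):
        """Call periodically, e.g. every 30 seconds, from a background task."""
        for name, probe in self._probes.items():
            try:
                self._healthy[name] = bool(probe())
            except Exception:
                self._healthy[name] = False  # any failure switches it off

    def choose(self, leases, rng=random):
        """leases: server name -> offload URL for one audio.

        Returns one URL from a healthy server, or None to serve locally.
        """
        usable = [url for name, url in leases.items() if self._healthy.get(name)]
        return rng.choice(usable) if usable else None


pool = OffloadPool({"north": lambda: True, "south": lambda: False})
pool.run_checks()
url = pool.choose({"north": "https://north.example/a1.m4a",
                   "south": "https://south.example/a1.m4a"})
```

Because a server is only used while its latest check passed, an offload server that goes down is dropped at the next check and picked up again once it recovers.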

And just to be clear at the end, the two main points are:

  • Instead of the audio file being part of bandwidth that costs money for the audio host, it costs no money for the audio host and either costs less money or nothing at all for the offload server, due to its hosting arrangements.

  • Additionally, not serving the audio file reduces CPU and disk load on the audio host's server and lowers the amount of traffic passing through it at any one time, which can just as well become a bottleneck for a high-traffic server.

I am willing to implement and host such an offload server for the community, and have already produced a draft specification for how the interplay between the offload server and the audio host would work that I can discuss with anyone willing to implement the counterpart.

Obviously, anyone adopting this would put a lot of trust in the offload server to not do the wrong thing, in any number of ways.


(Please note that I am not interested in building an audio host myself; it is a lot more work than the above, and there are already other people doing it. Please support their efforts.)

Posted
2 years ago