You had everything planned out: your new video hosting service rocks and people love you from day one. You chose S3 for its cheap traffic and figured ads would cover all expenses. Your traffic grows rapidly and every day more sites link in, bringing swarms of new visitors. Life is just great. Your TV set has just been repossessed by the bank. Hold on, what the hell just happened?
Let's rewind and replay slowly.
Since its introduction, S3 has been widely used by new players for file storage and hosting because of its low cost and lack of upfront payments. But if you don't take AWS Pirates into consideration, S3 can end up very expensive and hazardous to your health.
5 Rules of Thumb: ignore at your peril.
1. Never give anonymous access to your files on S3
There is never a reason to, is there? This directly translates to letting people you don't know consume bandwidth you pay for, without being able to defend yourself. It's not worth it. (Scary thought: what if a competitor wants to drive you broke?)
2. Enable access logs on your S3 bucket
Track down leechers as soon as possible. Amazon's access logs are similar to Apache's and are delivered on a best-effort basis only. Keep a record of the bandwidth used by each file you are hosting. Block those that go over quota.
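As a rough illustration of the bookkeeping, here is a minimal Python sketch that adds up bytes sent per file from access log lines and lists the files over a quota. It assumes the space-delimited server access log layout, with the object key in the 8th field and bytes sent in the 12th; those positions, the function names and the quota are assumptions made for this example, so check them against your own logs first.

```python
import re
from collections import defaultdict

# A token is a [bracketed] timestamp, a "quoted" string, or a plain word.
_TOKEN = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')

# Assumed 0-based field positions in one access log record.
KEY_FIELD = 7
BYTES_SENT_FIELD = 11


def bytes_per_key(lines):
    """Sum the bytes sent for each object key found in the log lines."""
    totals = defaultdict(int)
    for line in lines:
        fields = _TOKEN.findall(line)
        if len(fields) <= BYTES_SENT_FIELD:
            continue  # truncated or unrelated line
        sent = fields[BYTES_SENT_FIELD]
        if sent.isdigit():  # "-" means no bytes were sent
            totals[fields[KEY_FIELD]] += int(sent)
    return dict(totals)


def over_quota(totals, quota_bytes):
    """Return the keys whose total traffic exceeds quota_bytes, sorted."""
    return sorted(key for key, sent in totals.items() if sent > quota_bytes)
```

Since log delivery is best-effort, treat the totals as a lower bound rather than an exact bill.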
3. When possible, pass Expires to S3
Limit access to S3 storage by signing URLs and appending an Expires parameter. This requires users to request files through your servers rather than directly from S3, and gives you more control over who gets what and when.
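To make the signing step concrete, here is a minimal Python sketch of S3's legacy query-string authentication (signature version 2), which is the scheme that uses an Expires parameter. The function name, the 5-minute default lifetime and the virtual-hosted URL style are choices made for this example. Newer regions only accept signature version 4, so in practice you would most likely let an AWS SDK generate the link.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import quote


def signed_url(bucket, key, access_key, secret_key, ttl_seconds=300, now=None):
    """Build a time-limited GET URL using legacy query-string signing."""
    # Expires is an absolute Unix timestamp, not a duration.
    start = int(time.time()) if now is None else int(now)
    expires = start + ttl_seconds
    path = quote(key)  # keeps "/" so nested keys stay readable

    # GET request, empty Content-MD5 and Content-Type, then expiry and resource.
    string_to_sign = "GET\n\n\n%d\n/%s/%s" % (expires, bucket, path)
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode("ascii"), safe="")

    return ("https://%s.s3.amazonaws.com/%s"
            "?AWSAccessKeyId=%s&Expires=%d&Signature=%s"
            % (bucket, path, access_key, expires, signature))
```

Your server would call this only after deciding the visitor is allowed the file, then redirect to the result. Short lifetimes keep a leaked link from being useful for long.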
4. Serve files from your own servers
Most hosting packages come with a large bandwidth quota, which can also be expanded later if required. GoDaddy offers an additional 500 GB of traffic for $20 (traffic is calculated as rx and tx combined); that's $0.04/GB, instead of S3's $0.18/GB out.
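The arithmetic is easy to check for your own numbers. This small Python sketch uses only the prices quoted above; the 1000 GB monthly volume is a made-up figure for illustration.

```python
def cost_per_gb(price_usd, gigabytes):
    """Price of one gigabyte of traffic, in dollars."""
    return price_usd / gigabytes


HOSTING_PER_GB = cost_per_gb(20, 500)  # the $20 per 500 GB add-on
S3_OUT_PER_GB = 0.18                   # S3 outbound price quoted above

monthly_gb = 1000  # hypothetical monthly traffic
print("hosting: $%.2f" % (HOSTING_PER_GB * monthly_gb))
print("S3 out:  $%.2f" % (S3_OUT_PER_GB * monthly_gb))
```

Keep in mind that the hosting quota counts traffic in both directions, so a proxy that pulls files from S3 and then serves them uses quota on both legs.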
5. Use a reverse proxy in front of S3
Harness the power of S3 as a secondary storage platform. Configure a reverse proxy to download locally unavailable files from S3 and serve them locally. Squid has a killer solution with LRU caching.
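If the idea is unfamiliar, here is a minimal Python sketch of what such a proxy does: serve from a local cache when it can, fetch from the origin on a miss, and evict the least recently used files when space runs out. It is not how Squid is implemented. The class name and the fetch_from_origin callback (standing in for a download from S3) are hypothetical.

```python
from collections import OrderedDict


class LruFileCache:
    """Keep recently requested files locally, up to max_bytes in total."""

    def __init__(self, fetch_from_origin, max_bytes):
        self._fetch = fetch_from_origin  # hypothetical callable: key -> bytes
        self._max_bytes = max_bytes
        self._items = OrderedDict()      # least recently used entry first
        self._size = 0

    def get(self, key):
        if key in self._items:
            # Cache hit: mark as most recently used, no origin traffic.
            self._items.move_to_end(key)
            return self._items[key]

        # Cache miss: pay for one download from the origin.
        body = self._fetch(key)
        if len(body) <= self._max_bytes:
            self._items[key] = body
            self._size += len(body)
            # Evict from the least recently used end until we fit again.
            while self._size > self._max_bytes:
                _, evicted = self._items.popitem(last=False)
                self._size -= len(evicted)
        return body
```

Popular files are then downloaded from S3 once and served many times from your own bandwidth quota, while rarely requested ones fall out of the cache on their own.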