S3 in Business: 6 – A slow interlude

[Complete series]

Resting from the excitement of thinking up an entire new business model (last time), it’s time to do some sums about speed. How practical is it to use S3, or any other Internet service, for backups? Let’s look at how fast data can travel between your PC and Amazon S3.

ADSL stands for “asymmetric digital subscriber line”. A 2Mb ADSL line has an upload speed of 288 kilobits (36 kilobytes) per second, with the rest of the speed being used for downloads. That upload speed equates to about 8 hours per gigabyte.

For my 23.4GB iTunes music library, this means a total backup time of 187 hours. Whether you do nocturnal backups (8 hours per night, every night) or daytime backups while you’re away at work (11 hours per day, Monday to Friday) that works out at just over three weeks to back up my entire music library.

(For an 8Mb line the figures are 3.3MB per minute, or 5 hours per gigabyte, adding up to about 120 hours for the same library, or about two weeks.)
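
As a rough check on these sums, here is a minimal Python sketch. It assumes a decimal gigabyte (1,000,000 kilobytes), and the function names are made up for illustration. Because it doesn't round up to whole hours per gigabyte, its totals come out a little below the figures quoted above.

```python
import math

KB_PER_GB = 1_000_000  # assumption: decimal gigabyte


def hours_per_gigabyte(upload_kb_per_s):
    """Hours needed to upload one gigabyte at a rate given in kilobytes/second."""
    return KB_PER_GB / upload_kb_per_s / 3600


def sessions_needed(size_gb, upload_kb_per_s, hours_per_session):
    """Whole backup sessions (nights or working days) needed to upload size_gb."""
    total_hours = size_gb * hours_per_gigabyte(upload_kb_per_s)
    return math.ceil(total_hours / hours_per_session)


if __name__ == "__main__":
    library_gb = 23.4
    # 36 kB/s for the 2Mb line; 3.3 MB per minute (55 kB/s) for the 8Mb line
    for label, rate in (("2Mb line", 36.0), ("8Mb line", 3300 / 60)):
        print(label,
              round(hours_per_gigabyte(rate), 1), "hours per GB,",
              sessions_needed(library_gb, rate, 8), "nights of 8 hours")
```

On the 2Mb line this gives 23 nights of 8 hours, or 17 working days of 11 hours, which is just over three weeks either way.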

This sounds horrible. It’ll be enough to put many people off, but we’re rescued by a couple of facts. First of all, the Tunesafe program is designed to do a little work at a time and it doesn’t mind being interrupted: however long the journey, with one little step after another you’ll get there in the end. The other helpful fact is that iTunes music libraries are quite special things. The individual files within them are not all that big and they don’t change much over time, so that once your library has been backed up it stays backed up. After the initial burst of activity in the first few weeks, the uploads will slow to a trickle and your ADSL connection will be able to handle them easily.

Technical point: If you’ve rummaged inside your music folders then you’ll have seen that whenever you take a track and change something simple like the name of its artist or composer, the entire MP3 file changes. This is why an ordinary file backup program isn’t good enough for iTunes music: change “Mozart, Wolfgang” to “Mozart” and you could let yourself in for a whole extra night’s worth of backups! Tunesafe is cleverer and it doesn’t have to re-upload an entire music track just because you’ve changed the track title or the artist’s name. This is one of the reasons why Tunesafe is worth paying for.
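
How Tunesafe actually does this isn’t described here, so the following is only a sketch of one possible approach, not Tunesafe’s method. It assumes the track details live in ID3 tags (an ID3v2 block at the start of the file and an optional 128-byte ID3v1 block at the end) and fingerprints only the audio between them. The function names are hypothetical.

```python
import hashlib


def synchsafe_to_int(four_bytes):
    """Decode an ID3v2 'synchsafe' size: four bytes, 7 significant bits each."""
    b = four_bytes
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def audio_payload(data):
    """Return the bytes of an MP3 file with any ID3v2/ID3v1 tags stripped off."""
    start, end = 0, len(data)
    # ID3v2: 10-byte header beginning "ID3", with the tag size in bytes 6..9
    if len(data) >= 10 and data[:3] == b"ID3":
        start = 10 + synchsafe_to_int(data[6:10])
        if data[5] & 0x10:  # flag bit: a 10-byte footer follows the tag
            start += 10
    # ID3v1: fixed 128-byte block at the very end, beginning "TAG"
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128
    return data[start:end]


def audio_fingerprint(data):
    """Hash of the audio only; unchanged when just the tags are edited."""
    return hashlib.sha256(audio_payload(data)).hexdigest()
```

A backup program working along these lines could compare fingerprints and, when only the tags differ, upload the small tag block rather than the whole track.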

If you ever need to restore from your backup, you won’t have nearly as long to wait. Because ADSL is asymmetric, downloading is a lot faster than uploading – nine times as fast on a 2-megabit line or 17 times as fast on an 8-megabit line.

If you’re not looking at ADSL but at a project where you’re transmitting data to and from an Internet server, everything is a lot faster, but you should be aware that Amazon S3 isn’t worldwide yet. To give an idea of the speeds you might expect, uploading from our Rackspace server to S3 took 40 minutes per gigabyte over the transatlantic link from the UK. This is a lot faster than uploading from a PC, but it certainly isn’t instantaneous, and it’s worth allowing for if your business plans are international.
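
To compare the 40-minutes-per-gigabyte figure with line speeds quoted in megabits, a small conversion helps. This is a sketch assuming decimal units (1 GB = 1000 MB, 8 bits per byte); the function names are made up for illustration.

```python
def effective_mbps(minutes_per_gb):
    """Convert 'minutes per gigabyte' into megabits per second."""
    megabits_per_gb = 8 * 1000  # assumption: decimal gigabyte, 8 bits per byte
    return megabits_per_gb / (minutes_per_gb * 60)


def transfer_hours(size_gb, minutes_per_gb):
    """Hours needed to move size_gb at the given minutes-per-gigabyte rate."""
    return size_gb * minutes_per_gb / 60


if __name__ == "__main__":
    print(round(effective_mbps(40), 2), "Mbps")
    print(round(transfer_hours(23.4, 40), 1), "hours for a 23.4GB library")
```

At 40 minutes per gigabyte the link is running at roughly 3.3 megabits per second, so the 23.4GB music library would take between 15 and 16 hours.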

Next time, we’ll start thinking like potential investors in the Tunesafe business. We’ll look on the dark side: what are the risks of depending on Amazon S3?


One Response to “S3 in Business: 6 – A slow interlude”

  1. Chris Says:

    Fortunately, my university has an OC-3 to the Internet (and an OC-12 to Internet2–if only Amazon would strike up a partnership with them!), so my data transfer rate to Amazon S3 might be higher than yours. However, if the limit is on Amazon’s side and the roughly 3.5Mbps upload rate you show is all I’ll be able to get, my calculations show that backing up my 100GB drive (which is 95% full) will take me about two and a half days. (If I could commandeer the entire OC-3, were plugged into a gigabit Ethernet port, and Amazon could take it all, I could theoretically get my entire drive backed up in less than two hours–but I wouldn’t count on that.)

    Two and a half days is still a long time, and I’m bound to take my laptop away with me sometime in those 60 hours and lose the connection. But, it might be acceptable with a service that automatically resumes from where it left off if the connection is disrupted–I think JungleDrive does that. In any case, once the initial data is backed up, incremental changes should create much lower data transfer times.

    I wonder if this is the development that will finally result in a push for higher upload speeds to residential Internet connections…

