It’s been almost a year now since I wrote my first post on data backup so I thought I’d do a follow-up and detail the backup scheme I ended up going with.
The original scheme was fundamentally flawed in that the off-site backup was still pretty susceptible to loss, damage, or theft and wasn’t truly off-site in the broad geographical sense. I was simply storing it at a house about three blocks away. Given the spacing (or lack thereof) of buildings in San Francisco, a few blocks means very little considering the ever-present threat of earthquake and/or fire.
So with that in mind, and after reading through the comments on the original post, I took Eydryan’s advice and looked into Backblaze. Backblaze is an online backup service for PC and Mac that allows unlimited data storage for $5/month per machine. This seemed a little too good to be true, but I gave it a shot anyway. Much to my surprise, the service not only works, it works flawlessly and is about as dead-simple as anyone could ask for. It’s a little oversimplified for my tastes (being a PC user, I’m more accustomed to layers-deep menus with infinite settings and options), but it does its job and does it well.
Backblaze in action (Also available for Mac)
I’ve been on Backblaze for around three weeks now and I’ve pushed up about 400GB of the 987GB total I have set to back up. Obviously the initial backup is pretty slow and depends a lot on your connection speed (I’m on Comcast), but I just let it crank away all day in the background and it hasn’t yet interfered much with any of my day-to-day activities. The internet has been a little sluggish while it’s moving everything up for the first time, but it’s a temporary annoyance and well worth it. I suppose I could just run it at night, but I’ve opted to let it go 24/7 to get the initial backup out of the way as soon as possible. I’ll probably be done backing up around the two-month mark, at which point Backblaze will begin incrementally updating files I’ve changed on my side. All the backups are encrypted so only I can view or access my files, which is a good feeling when you’re posting your life’s work to someone else’s data center.
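For anyone curious where the two-month figure comes from, here is a rough projection in Python. It is only a sketch: it assumes the average upload rate over the first three weeks (taken here as 21 days) holds for the rest of the backup, and the function name is just something made up for illustration.

```python
def estimate_total_days(uploaded_gb, total_gb, elapsed_days):
    """Project the total backup duration, assuming the average rate so far holds."""
    rate_gb_per_day = uploaded_gb / elapsed_days
    return total_gb / rate_gb_per_day


# Figures from the post: about 400GB of 987GB uploaded in roughly three weeks.
total_days = estimate_total_days(uploaded_gb=400, total_gb=987, elapsed_days=21)
print(f"Projected total: {total_days:.0f} days (~{total_days / 7:.1f} weeks)")
```

That works out to a bit over seven weeks, which is in the neighborhood of two months. A real upload will speed up and slow down, so treat it as a ballpark.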
As for local backup, I’m running an internal mirror drive which I then back up to an external drive that is still stored off-site. That’s a total of three physical copies of the data to which I have easy access. Backblaze is great for peace of mind in case of catastrophic loss, but when you’ve just screwed up a 3GB PSB and need to go back a version, you really don’t want to deal with the downtime involved in pulling it down from a server.
I’ve never felt this confident about data security. The combination of local and online backup is virtually foolproof and gives me the best of both worlds in terms of ease of access and security. It’s actually sort of scary looking back at that first post and realizing how long I lived with the old system; catastrophe was just a careless neighbor away.
How many of you are now using online backup, and what services are you using? Let us know in the comments.
UPDATE: eydryan has some more info on the subject in the comments
Regular backups should be an integral part of any creative’s computer workflow, but unfortunately they seem to be neglected by a lot of people. It seems like an easy thing to do given that the alternative means betting your life’s work on the health of your hard drive, but I guess it’s sort of like flossing or taking vitamins. As I’ve recently come to realize, fear of hardware failure isn’t the only reason to back up; fire, flood, theft, and user error all threaten to rob you of your hard-earned intellectual property. I’ve always taken backups pretty seriously, but I have had some close calls, and a very recent one has compelled me to adopt a more robust backup solution.
Some years ago, shortly before I finished my first album, my main data drive experienced a mechanical failure. Luckily I had a backup drive sitting right above it. So I bought a replacement for the original drive and went about installing it. As I was putting it in the case I accidentally dragged a screwdriver past the IDE pins on the backup drive (which was at that moment the only intact copy of all my work) in just the right way to arc the power connector and fry the controller board. At that moment I thought I had lost everything I ever did, the new album, and my sanity. Luckily the damage was isolated to the controller board and I was able to pick up a similar drive, transplant its controller, and recover my data. I learned a hard lesson that day and ever since I’ve been more careful about backing up.
Fast forward to last week. It had recently occurred to me that I should have off-site backups. In a city like San Francisco, fire is a big concern and all the backups in the world can’t help you if they’re sitting in the same place as your data when it all burns to the ground. So I started leaving my backup drive at a friend’s house and bringing it home during the day to back up work from the previous night. The problem is that two weeks had passed since the last time I brought that drive home and backed up. Then last week I was partitioning a disk during a Windows install and accidentally deleted the primary partition of my main data drive, and with it the past two weeks of work. Fortunately, partitions are relatively easy to restore (Active@ makes a very powerful data recovery suite) so this wasn’t a huge deal, but it definitely gave me flashbacks of the near catastrophe I had experienced years earlier and got me thinking I needed to start using a new system.
James E. Gaskin defines a good backup system as “Automatic, redundant, and restorable” and I would like to add off-site to that list. The system I was using until today covered only two of those bases. Now, I would love to use an online backup service (it would solve all of these problems), but I have about 1.5TB of files that need to be mirrored, and a typical night of work will generate around 2GB of new files and/or file changes which need to be backed up. Every online solution I’ve seen would end up being ridiculously expensive at these sizes, and given that my Comcast internet upstream is less than 1Mb/s, it’s really not practical if I need to move a lot of data, which is more often than not. Given all of that, I’ve ruled out online backups until they bring fiber into my neighborhood or the cost of the services comes way down. So I’m left with simply scaling up the backup scheme and using multiple traditional drives. The system I ended up going with is laid out like this:
1. Main data drive: A RAID5 array with three 1TB drives. This is the main drive that I work from and where I store all of the work. RAID5 uses rotating parity so that even if one of the drives fails, all of your data can be rebuilt from the two remaining drives. Reading from a RAID5 array is also generally faster than reading from a single drive, though write speed depends on the controller because of the parity overhead (sort of like a redundant version of RAID0, more info here), so it’s a nice bonus to have this as the working drive.
2. Local backup drive: One 2TB drive which is mirrored from the main drive every night. I use Backup Magic to do the mirroring. It’s lightweight, powerful, and best of all, it only runs when I tell it to. I don’t like automatic backup apps that run in the background; they always tend to overstep their bounds and eat up system resources.
3. Off-site backup drive: One 2TB drive in a hotswap SATA bay (similar to this). The plan is to pop this in every week or so, mirror from the main data drive, and then take it back off-site for safekeeping. Even if both local drives fail or my house explodes or something, at least I don’t lose my entire life’s work.
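If you’re wondering how the RAID5 array in item 1 can rebuild a dead drive from the other two, the basic idea is XOR parity. Here is a toy Python sketch of a single stripe on a three-drive array. It is a simplification, not how any particular controller is implemented: real RAID5 rotates which drive holds the parity from stripe to stripe, and the block contents and the `xor_blocks` helper are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a three-drive array: two data blocks plus one parity block.
data_a = b"album-01"  # hypothetical contents of the block on drive 1
data_b = b"poster02"  # hypothetical contents of the block on drive 2
parity = xor_blocks([data_a, data_b])  # block stored on drive 3

# Simulate losing drive 1: XOR the two surviving blocks to rebuild its data.
rebuilt_a = xor_blocks([data_b, parity])
print(rebuilt_a == data_a)
```

The same trick works no matter which of the three blocks goes missing, which is why the array survives any single drive failure but not two at once.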
Just a note: This backup scheme is for my PC. On my MacBook Pro I use Time Machine to back up to a single external drive, but the problem is that there’s no redundancy. If both drives fail, you’re screwed.
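As a rough sanity check on the bandwidth argument from earlier in this post, here is the upload math as a short Python sketch. It assumes a sustained 1Mb/s upstream (the post only says the real figure is less than that), decimal units, and no protocol overhead, so actual times would be longer. The `upload_hours` function is a hypothetical helper, not part of any backup tool.

```python
def upload_hours(size_gb, upstream_mbps):
    """Hours to upload size_gb (decimal gigabytes) at upstream_mbps megabits/s."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (upstream_mbps * 1e6)
    return seconds / 3600


# Figures from the post: about 1.5TB to mirror, about 2GB of changes per night.
initial_days = upload_hours(1500, 1.0) / 24
nightly_hours = upload_hours(2, 1.0)
print(f"Initial upload: about {initial_days:.0f} days")
print(f"Nightly changes: about {nightly_hours:.1f} hours")
```

Under those assumptions the initial upload alone would take months of saturating the connection, and each night’s changes would take hours.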
It’s easy to forget that as computer-based creatives, everything we’ve ever done, all of our intellectual property, is sitting in a little metal box, and there are a lot of things that can go wrong with that box. Regular backups are a must and off-site backups are highly recommended. I know the system I’m using isn’t foolproof (I guess nothing really is), but I feel a lot more secure knowing the data exists on three drives in two separate locations. How about you, what system do you use to back up? For all you Mac users, is Time Machine enough for you or do you have a secondary system in place? Anybody using an online backup solution? (And if so, what size is your data?) Let us know in the comments.