Blog Post

About those backups...


So I went to present at Charleston PASS on Thursday night and of course I had my standard contact information on my slides. This included the web site I have (had) as my brochure site (it's not up, so no link). I hadn't checked it in a while because I hadn't actively updated it. It needed to be updated, but there were other priorities in front of it.

One of the gentlemen who attended my talk emailed me asking for the slides and also mentioned that the site didn't seem to be up. I checked it remotely, or at least as best I could from a state park where you go mainly to get away (meaning low to no cell signal). Server timeouts. Of course, I was getting that just trying to hit Google. So I made a note to check it once I got back to "civilization." I did and... same error.

The site was hosted by a friend who at the time had excess capacity and graciously offered hosting. I understood the deal: everything would be best effort, and you get what you pay for. So Monday I asked him about it. His response: "Remember the SAN crash?" I vaguely did. He had sent an email and I had scanned through it, but I didn't think much of it because he's always the type to nail his backups. He's paranoid like that.

He had a couple of drives up and go bad at the same time on the same LUN, thereby taking out that LUN. He had backup jobs for that LUN, but since the LUN was for auxiliary stuff (friends' websites, personal projects, etc.) he hadn't checked the backups in a while. The production LUNs for paying customers were a completely different story. Needless to say, my site was on that "auxiliary" LUN.

In my case, it's no big deal. It was merely a brochure site that I've had a task to update going on 10 months now, which indicates its low priority. I had considered just starting over because the last time I touched the site was probably around 3 years ago. It wasn't exactly a modern marvel of technology or anywhere near up-to-date on me and what I'm doing. This makes that decision easier.

However, it reminded me that if you're depending on someone else for your hosting, you need to make sure you have a backup of your site. Years ago, an organization I was with was using a hosting firm for some of the sites that were less critical to that organization. The hosting company's web servers were infected and they were forced to go to backups to recover. However, they hadn't tested their backups in months. Some of the restores took the sites back 6 months, meaning all changes since then had to be re-deployed. Thankfully, the folks who maintained those websites on our side always insisted on keeping local copies of the sites up-to-date. So in our case, our anticipation of an issue meant we were able to bring the sites back up to current relatively quickly. Other customers were not so fortunate. Yes, the firm had backups. Yes, the firm had an SLA on data loss. But the reality was they busted it and if we didn't have a local copy, we'd have been burned. An SLA allows you to recover monetary damages. It doesn't put your web site back.

So the moral is, whether you're talking databases, file systems, or web sites, make sure that if you are in control of your backups, you verify them. If you aren't, figure out a way to make your own backups, and then verify those backups too.
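
For a small brochure site like mine, "make your own backup and verify it" could be as simple as the rough Python sketch below. It is only an illustration, not what my friend or that hosting firm ran. It assumes you already have a local copy of the site in a folder and that the archive is written somewhere outside that folder. The names make_backup, verify_backup, and file_hashes are made up for this example. The idea is to read every file back out of the archive and compare checksums against the live copy, rather than trusting that the backup job finished.

```python
import hashlib
import tarfile
from pathlib import Path


def file_hashes(root: Path) -> dict[str, str]:
    """Map each file under root (by relative path) to its SHA-256 digest."""
    hashes = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            hashes[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def make_backup(site_dir: Path, archive: Path) -> None:
    """Write every file under site_dir into a gzip-compressed tar archive.

    Assumption: archive lives outside site_dir, so it never backs up itself.
    """
    with tarfile.open(archive, "w:gz") as tar:
        for path in sorted(site_dir.rglob("*")):
            if path.is_file():
                tar.add(path, arcname=path.relative_to(site_dir).as_posix())


def verify_backup(site_dir: Path, archive: Path) -> list[str]:
    """Read the archive back and return paths that are missing or differ.

    An empty list means every file in the live copy matches the archive.
    """
    archived = {}
    with tarfile.open(archive, "r:gz") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                archived[member.name] = hashlib.sha256(data).hexdigest()
    current = file_hashes(site_dir)
    all_paths = current.keys() | archived.keys()
    return sorted(p for p in all_paths if current.get(p) != archived.get(p))
```

Run verify_backup right after make_backup and you should get an empty list back. Anything else means the archive isn't the safety net you think it is. This only proves the archive is readable and matches the files at that moment; for a database you'd still want to do an actual test restore.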