Worst Practice - Detailed Disaster Plans

2004-11-09

Many people will read the title above and think I'm crazy. You have to plan for disasters, right? You need detailed plans and procedures to ensure that you can put the infrastructure back together and get the business moving, right? 8 out of 10 businesses fail after a major disaster because of IT, right? Or at least some other very scary statistic like this.

While a lot of businesses may fail after a disaster, it's certainly not because they haven't practiced their plan in detail, or even because they don't have a detailed recovery plan in place. It's more likely other issues, and I'll lay out the argument next for why I don't think you need a detailed recovery plan. First, I do think you should have something, and I've written a few articles on the subject: Incident Response - The Framework and Responding to an Incident.

Stop and think for a minute about what changes you've made to your infrastructure over the last two months. Patches, configuration changes, software added, new enhancements or bug fixes you've deployed, new servers. There's a huge list of things that you probably change every month. Now take a minute and think of how many are documented. How many times when you changed something did you update a document somewhere that reflects the change? A diagram, notes, anything. Even a simple text or Excel-based log?

Chances are relatively few updates have occurred. Now how many times did you think about it, or do you suspect the person making the changes thought about documenting them? Probably very few. I've worked in large companies where it was policy that changes had to be documented and submitted to a Change Management group before they could be approved. Let me tell you, while more things were documented than in many places, lots slipped through the cracks. In fact, the best documentation I've ever seen was when I managed 4 people and could scream at them for any change not documented. Even better than when I was responsible for deployment and documentation :).

See, people hate documenting everything. Not just developers; admins hate it worse, since most of them can't type that well. Now extend that to any size of group and it gets more and more difficult to manage. Chances are very high that any plan you do develop will be out of date as soon as it gets published. What's more, it won't get updated regularly and will always be out of date. So if you depend on that plan to get you through a disaster where somehow all your IT guys and girls are gone, and you want to give this plan to a consultant to get you up and running, you'll surely fail. Possibly worse than if you just let someone run with no knowledge.

The other problem with having too detailed a plan is the distribution problem. Obviously, at least I hope it's obvious, you don't want to keep your DR plan on the network. At least not the only copy. You'll need at least one copy offsite, probably want a few of them, which means that any updates to the plan need to be distributed. The more items you include in a plan, the more likely that you'll leave some of them out. Especially if you are trying to keep up with the pace of change in any shop. It will far exceed your ability to keep up with it.

And once again, if you are planning on a comprehensive, complete plan, you'll be out of luck.

Having a detailed recovery plan (order of servers to recover, exact configuration, detailed plans for setting up an application) is a worthy goal, but one that's impossible to keep up to date. I'm not advocating ignoring all documentation. In fact the opposite: I'd advocate as much documentation as you can have. But for a disaster, have a general outline and depend on your people. Which means making sure they are dependable, which implies treating them as the valuable PEOPLE they are. Your employees, the ones that keep your company going, are not "resources", or "assets", or "knowledge workers". They're people, and they have the ability and, more importantly, the pride to get things back up and do a great job. Plan for the outline of what to do, expect things to go wrong, and trust your people to think on their feet.

Most of all, when things don't go well, don't blame them or get angry. Instead, repoint them in the direction of what you need and let them run with it.

©dkranch.net 2004





