The Voice of the DBA

Steve Jones is the editor of SQLServerCentral.com and visits a wide variety of data-related topics in his daily editorial. Steve has spent years working as a DBA and general-purpose Windows administrator, primarily working with SQL Server since it was ported from Sybase in 1990. You can follow Steve on Twitter at twitter.com/way0utwest.

SQL Server Disasters

I joined a conference call from Quest late this morning, where a number of their experts were talking about various disaster stories that they'd experienced over the years. It's great to hear real DBAs talking about the problems and challenges they've actually experienced, and it should give those who haven't been through a disaster some confidence that they can learn to deal with one.

One of the presenters said that you'd be guaranteed to have a disaster at some point in your career. At first I thought, sure, that's probably true, and I've had a few of my own. But then I started thinking about all the DBAs that get interviewed who don't know about restoring to a point in time, who never restore logs, just fulls, and who still have jobs and are successful.

SQL Server gets more stable all the time. Hardware becomes more redundant, disk drives warn of failures before they occur, and CPUs are manufactured better, so I wonder whether everyone really will have a disaster. And what's a disaster?

Is it when someone deletes data? Is it when a patch doesn't work and you have to uninstall it? I guess those are disasters, but they're minor ones. Rolling back an upgrade, patch, or application change might involve some of the skills that you use in an unexpected disaster, but I'm not sure that's what I'd consider a disaster. It's more of an incident to me.

I do think, however, that you should be prepared, and if you haven't practiced any of the skills involved in recovering, you should. Everyone should:

  • Restore a full and multiple logs as a test at least once a quarter, and preferably once a month.
  • Restore to a point in time once a quarter, just to practice the skill.
  • Track the version/build levels for your servers on a regular basis, preferably every day or week with some automation.
  • Know your vendor support numbers, product keys, and other administrative details necessary for recovering your servers.
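The first three items lend themselves to a bit of automation. Here's a minimal sketch in Python that only builds the T-SQL for a practice restore; it doesn't connect to anything or run it. The function name, the database name, and the backup paths are all made up for illustration, and it assumes you're restoring onto a test server where a database of that name doesn't already exist (restoring under a different name or location would also need WITH MOVE, which isn't shown).

```python
from datetime import datetime

# Query you could schedule to record each server's build level.
BUILD_QUERY = (
    "SELECT SERVERPROPERTY('ProductVersion') AS build, "
    "SERVERPROPERTY('ProductLevel') AS level;"
)


def build_restore_script(database, full_backup, log_backups, stop_at=None):
    """Return T-SQL statements that restore one full backup, then its logs.

    Hypothetical helper: it only generates text for you to review and run.
    """
    def quote(path):
        # Double any single quotes so the path is a valid T-SQL string literal.
        return "N'" + path.replace("'", "''") + "'"

    # Bracket the database name, escaping any closing bracket inside it.
    db = "[" + database.replace("]", "]]") + "]"

    # Leave the database in the restoring state so logs can be applied.
    statements = [
        f"RESTORE DATABASE {db} FROM DISK = {quote(full_backup)} WITH NORECOVERY;"
    ]

    options = "NORECOVERY"
    if stop_at is not None:
        # Point-in-time restore: the same STOPAT goes on every log restore.
        stamp = stop_at.strftime("%Y-%m-%dT%H:%M:%S")
        options += f", STOPAT = '{stamp}'"

    for log in log_backups:
        statements.append(
            f"RESTORE LOG {db} FROM DISK = {quote(log)} WITH {options};"
        )

    # Final step brings the database online.
    statements.append(f"RESTORE DATABASE {db} WITH RECOVERY;")
    return statements


if __name__ == "__main__":
    script = build_restore_script(
        "SalesTest",
        r"D:\backup\sales_full.bak",
        [r"D:\backup\sales_log1.trn", r"D:\backup\sales_log2.trn"],
        stop_at=datetime(2009, 3, 20, 10, 15, 0),
    )
    print("\n".join(script))
    print(BUILD_QUERY)
```

Leave out stop_at and you get the plain full-plus-logs restore; pass it and you get the point-in-time practice run. Either way, read the generated script before running it against a test instance.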

And, probably more importantly, keep your resume up to date. If a disaster strikes, hopefully you'll be able to solve it and get things fixed. However, you never know when you'll take the blame for it, whether you are at fault or not. And you don't know if your company will survive.

As my son has learned in scouting, "Be Prepared" for most anything that comes your way.

Comments

Posted by Dugi on 20 March 2009

Short and nice post here, Steve!

Thanks for the bit of information about server disasters!

Posted by Phil Factor on 23 March 2009

Another reason for doing regular restores is to double-check that the backups haven't somehow got mangled. Yes, it happened to me once.

Posted by rudy komacsar on 24 March 2009

Your success as a DBA lies entirely in one thing in this type of situation. One would think that this is "How good was your last restore?" The true question is "How good is your present restore?"

Posted by Murali Krishnan on 25 March 2009

A Nice piece of Info.

Posted by Richard Howells on 25 March 2009

Hi Steve - ALL the things you mention are just incidents if you can recover from them.  The ONLY time you have a disaster is if you:

a) cannot recover

b) find that your only recovery option ends up losing a load of data

c) take so long to recover that the business loses a load of money waiting for you.

So the probability of an incident becoming a disaster is linked to the number of different options you have for restoring, and the speed at which you can do it without screwing up.

Hence I like your suggestions for PRACTICING the skills.

Posted by Ed Salva on 25 March 2009

We got some extra practice yesterday, having to rebuild the server, restore an image, and then do the restore on an older sql 5 server. Fortunately we had the spare equipment to do this.

Posted by mario.oliveira on 25 March 2009

Better safe than sorry.

Posted by adityanayak53 on 25 March 2009

Better prevent and prepare than repent and repair.

Posted by bob.willsie on 25 March 2009

The difference between disaster and inconvenience is recovery time.

Calm nerves help.  That's another way the practice pays off.  You know what to do and can tell people to have patience while you do it.

As for taking the time to practice, I am amazed at the number of times over the years I've heard someone say "that will never happen" or "we'll never do that", only to have it happen.  Sometimes within a year, sometimes within seconds.

Posted by eng_mgomaa on 25 March 2009

That's a very interesting post, Steve, and it reminds me of one situation when the RAID controller in one of my servers failed and I was really in a disaster. Since then I always make backups and keep a disaster plan. What I want to say is that backups and disaster planning are very important, and not all of us remember them or pay enough attention to them. We pay attention only when we face a real problem, but by then we are not ready for the disaster.

Posted by justin on 29 March 2009

Thanks for the reminder, Steve.

Posted by daher_omer on 29 March 2009

I think it is just an exercise when you can recover from a disaster, but what if the backup fails?

What about a backup hard drive failure?

Posted by wcterry on 7 April 2009

Great advice there. Keep your resume updated and follow #jobangels, wcterry on twitter.
