Crash and Burn

  • Comments posted to this topic are about the item Crash and Burn

  • Heh... I absolutely agree... all the backups in the world mean squat unless you can restore them. The DBAs at a place I worked a couple of jobs ago went through that. It took 40 people ten days to get almost all of the data back. It was a huge mess.

    I noticed that Jeff lost the data on the hosting provider... the more I hear about things like this, the more I'm inclined to say keep the cloud... there's too much weather there.

    😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Excellent point on checking ability to restore.

    I started as IS manager at a company some years back and asked what they had for backup and restore procedures.

    The VP of Sales (also their "Techie") promptly informed me that they knew they had good backups because they hadn't gotten any error messages on their backup runs for over 6 months.

    This was particularly odd to me when I noticed the dust on the backup tape that was in the drive. It looked like it had been in the machine 6 months itself.

    A check of the backup logs indicated that the last successful run was also over 6 months in the past, due to a dead drive.

    We quickly replaced the drive, did backups, and tested restores.

    I still haven't figured out how they managed to be so lucky that they made it 6 months without someone wanting at least a minor file restored.

  • Trust no one but you with your stuff. You are ultimately responsible.

  • skjoldtc (12/29/2009)


    Trust no one but you with your stuff.

    When it comes to my data, I find it hard to even trust myself.

    Trust, but verify...

  • Agreed. Also, practice makes perfect. I perform restores on a regular basis. Not just for rehearsal, but also to refine DR recovery procedures and to confirm that a copy of the latest backup is not corrupt.

  • In two of the presentations I regularly give, I emphasize the importance of maintaining backups that are restorable, including the need for storing backups offsite and regularly checking to see if backups can be restored. Often, when I say this during presentations, many members of the audience roll their eyes, as if saying to me, "yeah, duh, that's obvious." The reason I include such "obvious" advice in these presentations is that some DBAs don't put it into action, and I want to provide a little reminder to them. So, if you attend one of my presentations in the future, and you hear this advice from me, please stand up and shout, "great advice, man", so that those DBAs who roll their eyes will know that I'm not the only one who believes in the importance of DBA basics.

    Brad M. McGehee
    DBA

  • Last year we had a full test of our disaster recovery plan (always comforting to be part of a company that has one). It mostly went well, except that the disk segment replication to our DR site ended up with unsync'd filegroups for our primary db server, so we couldn't do the restore, meaning a complete failure in the end.

    If it had been a real disaster, we would have had to resort to tape backups, taking our recovery time from 3 hours to over 48 hours, with some data loss.

    Always good to make sure a restore works.

  • I've encountered the rude awakening a few times myself. Once, at a company that had no DBA, just "the SQL boy," a manager issued a DROP TABLE statement by accident on a database using the bulk-logged recovery model. The transaction logs had never been backed up, and had grown to 300GB on 20GB of data. We soon realized that the 100GB of free space was insufficient to restore last week's backups (they took so long that they were only done on weekends), so the restore needed some creative space management and several days of data transfers on a severely overloaded and badly designed system. They ended up regenerating the missing data over a three-week period because it was faster. They didn't lose ALL their customers...

  • Excellent, excellent, excellent post. Nothing is more important than verifying backups, except verifying that you know how to run a restore. You're absolutely right when you say that backups are no good unless you can restore them, but it goes beyond validating that the backup files themselves are valid and accessible. You need to know that you, and any other DBAs in the organization, can actually run a restore, know how to read the file header, can do a point-in-time recovery, etc. Practicing database restores validates not only that the backups are good, but that you're good as well.
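
    As a rough sketch of what such a practice run might cover, the snippet below only builds the T-SQL for a restore drill: check the backup file, read its header, then do a point-in-time restore into a scratch database. RESTORE VERIFYONLY, RESTORE HEADERONLY, and STOPAT are standard SQL Server syntax; the Python helpers, the database name, and the file paths are made up for illustration, and restoring alongside the original database on the same instance would also need MOVE clauses that are left out here.

```python
from datetime import datetime
from typing import List


def quote_literal(value: str) -> str:
    """Return value as a T-SQL Unicode string literal (single quotes doubled)."""
    return "N'" + value.replace("'", "''") + "'"


def quote_name(name: str) -> str:
    """Return name as a bracket-quoted T-SQL identifier (']' doubled)."""
    return "[" + name.replace("]", "]]") + "]"


def restore_drill_statements(scratch_db: str, full_backup: str,
                             log_backup: str, stop_at: datetime) -> List[str]:
    """Build the T-SQL for one restore drill. Nothing is executed here."""
    db = quote_name(scratch_db)
    full = quote_literal(full_backup)
    log = quote_literal(log_backup)
    stop = quote_literal(stop_at.isoformat(timespec="seconds"))
    return [
        # 1. Check that the backup set is complete and readable.
        f"RESTORE VERIFYONLY FROM DISK = {full};",
        # 2. Read the backup header (backup type, dates, and so on).
        f"RESTORE HEADERONLY FROM DISK = {full};",
        # 3. Restore the full backup, leaving the database in the restoring state.
        f"RESTORE DATABASE {db} FROM DISK = {full} WITH NORECOVERY;",
        # 4. Roll the log forward to the chosen point in time, then recover.
        f"RESTORE LOG {db} FROM DISK = {log} WITH STOPAT = {stop}, RECOVERY;",
    ]


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    for statement in restore_drill_statements(
        "Sales_Drill",
        r"D:\backup\sales_full.bak",
        r"D:\backup\sales_log.trn",
        datetime(2009, 12, 29, 10, 15, 0),
    ):
        print(statement)
```

    Keep in mind that VERIFYONLY only confirms the backup set is readable. Actually running steps 3 and 4, and then querying the restored database, is the real test.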

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • It's amazing how people manage their data. More than half of the companies I work with have never tried a restore, even of their primary application. Often things are in such disarray that it takes a complete restructuring of the backup process just to get a restore to work. It's common to see full backups failing, the wrong recovery model on databases, and more. In most of these cases the company has no DBA, just someone in IT who knows a little more about SQL than the rest of the staff.

    Steve's post reminds me of the days when I spent a lot of time interviewing people for developer and DBA jobs within a large organization. During the interview, they would point me to a web site or blog they were very proud to have created. You won't believe how often these sites were down or broken! The moral of the story: when you use the cloud, make sure your hosting company has standards as high as your own when caring for your data.

    LinkedIn - http://www.linkedin.com/in/carlosbossy
    Blog - http://www.carlosbossy.com
    Follow me - @carlosbossy
