Blog Post

T-SQL Tuesday #98 – Your Technical Challenges Conquered – Corrupt Database in System Table

,

T-SQL TuesdayThis T-SQL Tuesday is brought to us by Arun Sirpal (b | t) and wants to write about a technical challenge we have conquered.  Having been a DBA for 18 years now it’s hard to pick on a technical challenge but I will stick to the more recent challenge of recovering a corrupt database.

The particular challenge comes with a story.  Imagine driving six hours to nice snowing destination for ski trip while leaving your coworkers (mind you five of them) behind to do maintenance to apply a CU for all 140+ production SQL servers.  You are on this nice, short 2.5-day vacation too because the doctor says to take a break your migraines need some rest. But your coworkers reach out to after being up for over 36+ hours with one server that just keeps rebooting itself randomly after being up a short period.  Here comes the challenging part you are still not in your hotel room so you are working on the mountain to start within the Starbucks (good times) and you are suffering from the migraines you have had since November and here it is March.

So, it’s not obvious at first way the server is just rebooting itself at first. The team has tried creating a new server and restoring all the databases on to the new server. FAIL. So, it comes down to then there must be something wrong with one of the databases.  You go ahead and open a support ticket with Microsoft and start running CHECKDBs one by one the databases.  The management team is taking databases offline and bringing them online, the server is spitting out memory dumps that you give Microsoft support.  So you start working on the problem at 3:00ish PM in the afternoon and tell your coworkers you can take it from here they can get some sleep, they eventually sign off.  At some point your room becomes available and you migrate there and the Internet connection there is slower than in Starbucks.

Microsoft support contacts you back to start running CHECKDBs and stays up all night with switching people with you at some point.  Microsoft Mark has I called him since I work with a Mark and have a son named Mark was the got the all mightier.  You communicating back and forth with your direct manager via Slack and via the company in Hangouts and as you get one CHECKDB finished having to get permission to bring another database online to run CHECKDB against it.  I think the hardest part was just answering the question is do you know when the time frame is for when having the database live. Sometime around 1 AM you finally out of 40+ databases find the bad database and the corrupt table is system table.  Oh boy, you tell yourself how to fix this.

Great news for us we keep schema only copies of our databases.  You restore a copy of the schema only database to the server that uses solely for these purposes and your restore the corrupt database.  Next challenge how to copy the data into the database, Microsoft support to the rescue.  They give their Azure data migration tool to you but it requires Visual Studio, which you only have on your laptop and not on a box a work.  You fire it up and start running it takes hours, as soon as it starts running you ask someone install Visual Studio on a jumpbox at work so we can make the process faster.  Once that is installed you fire it off again and you have the database copied in 30 minutes.  Then you restore the newly minted database to the server without any data loss and earn part of your Database Superhero title. The best part is doing all of this actually gets rid of your migraines.  The worse part you were up all night so you spend your ski vacation sleeping beside the fireplace and just eating at restaurants with your friend and after waking up your migraines are back.  But two important lessons run CHECKDB more often and SQLing helps with migraines.

Takeaways from this story.  When doing maintenance have spare people, do run CHECKDB often, keep schema only copies of your databases, and have a spare server to do restores on.

 

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating