Keep Cool Under Pressure
It's late one night, well not too late, but it feels late. After a full day or work and then another half day with the kids, sports practice, cooking dinner, a little leisure time with them and getting them tucked in, it feels late even if it's only 9:30 at night. I work at home for SQLServerCentral.com and this gives me a great flexible schedule, so I find myself working at all hours of the day and night sometimes.
The newsletter hasn't gone out yet as I've been concentrating on a bunch of editing all day and so I sit down to get it ready to go. As I do, I realize that a new advertisement came in earlier and I need to load it up and get it ready for release. One of the things that we do is replace all vendor URLs with a redirect to track the clicks for them. Some do it on their end and some don't, but we grab them all to ensure we can report some stats to them. That's one of the things that keeps this site going and my paycheck coming in.
So I replace the URLs, load them into our redirector, generate a newsletter after typing my daily editorial nonsense, and run a test. I start checking URLs in the test newsletter and realize that one doesn't work. I go back and check the original copy and sure enough the vendor sent me a newsletter link that doesn't exist. That's kind of funny, but since it's late, I just want to fix it. Since everything is in our redirector, it's a simple UPDATE to fix the URL. So I type (names changed to protect the innocent)
set urlvalue = 'http://www.vendor.com/correcturl'
where vendorID = 999
It's late, I'm trying to get done as I still have a bunch of emails to process and get complete and so I put the cursor at the first line and do my usual "Hold-down-SHIFT-downarrow-twice" and hit CTRL-E and get ...
(420 rows affected)
But before I even register that, which is the double confirmation for me, I notice that the first two lines are highlighted, but not the last. I've just performed the most common and deadly "DBA Whoops!"
I know, I know, you're all applauding, probably laughing and hopefully haven't just spit coffee all over the monitor. Actually that might be better since then you couldn't read any further. Immediately I realize the implications of what I've done. All past newsletters, the archives, all rendered useless with the redirector since I've updated all URLs to the same value.
Not what I need on any day, much less a late night when I'm trying to get finished.
Whether or not this has happened to you, or whether you've performed some other less than stellar DBA miracle, the important thing here is to keep cool. Now I've done this particular mistake quite a few times, but I've also gotten experience with many other types of stupid mistakes. Usually of my own doing, but often not, so despite the heart rate racing and the temptation to panic or jump up and down, the important thing is to remain calm and work through things.
Now this is a 800MB database, so it's not huge, but not small. The important thing here is to get the data back ASAP. While there may be many ways to do this, the easiest to me is a backup and restore. Not of the database, but just this table.
Immediately I connect to the server, create a 1MB database, and then kick off a restore of last night's backup. This table rarely changes, so that works for me. If it had changed more recently, I'd be following either Kathi's or Grant's instructions, but here a simple restore will work.
The restore happens quickly and I then check the table in the new database. It's got all but the last three rows, so I do a cross database update between the old and new databases on the primary key to fix the URLs. Fortunately two of the three rows are the value that I still have in another query and the last is correct as it's the one that started this mess.
Generate the newsletter again, check it and go. Add in a few tests of other previous newsletters to be sure I've fixed things and I'm off to bed. After a few more emails.
The most important lesson is the one that I've already mentioned. Keep cool. It doesn't help to panic, it clouds your thinking and can prevent you from getting out of the mess. It can also lead you to compound the problem, so take care and think things through. Whether you've set every book in the Barnes and Noble catalog to $7.99 or you've wiped out the entire order history with your employer, it's done. The really important thing is to take a deep breath and relax.
Easier said than done, but trust me you need to do this. The next step is to assess the damage. Quickly. You should know what the impact of what you've done is. After all, it's your database and you know the data model, right? Maybe not, this might be one of hundreds you manage and someone told you to run a query. So assess the damage and its implication. Quickly.
Once you know what you've done, the next thing would be to fix it, right?
NO. The next step is to inform your manager or whoever is in charge of this. It's tempting to try and cover things up, to try and get by without being noticed, but I've been through this myself and watched lots of others. Coming clean is always the best thing you can do. Especially if something else goes wrong. The reason you should inform someone is threefold. One it's the right and professional thing to do. Two, if there are serious implications, then you can get people moving to deal with them immediately. And three, you move the non-technical part of a crisis to someone else. Let the managers earn their money and deal with everyone outside your group. Step three is inform someone. In case you're wondering, no I didn't inform Andy or Brian as it was late in Denver, so they would have been asleep and there wasn't anything they could do. If I hadn't fixed it, I probably would have called. Plus, I'm in charge :)
Now we fix it, right? No. Before you fix something, get a plan together. Stop and think about what is needed to undo your mistake. In my case, I actually spent a couple minutes running through the process of the restore, considering what I'd need to do if I didn't have the data in the backup. I went over the changes for the day and knew that I'd restore, check the data, then do a cross database update. I knew which rows I'd added and knew that I'd be digging back through emails if there were more missing rows for the URLs that I needed. This was a simple problem, but in more complex situations, it is really helpful, especially when there is more than one person involved, to have a plan that everyone agrees on and knows how to execute. Make a plan.
Now the step that everyone is waiting for. Fix the problem by executing your plan. Hopefully the plan is a good one and doesn't compound the problem. That will happen sometimes and then go back to step 1 and start again. Possibly with more help, PSS, consultants, etc., but go slow and make sure things work.
I've had CEOs breathing down my neck, people literally screaming at me over the cost of lost business at times, and I've let it affect my work. Maybe I've still quickly fixed the problem, but it shook me for awhile and made me less effective for a long time afterwards. No matter how critical the mistake is, no matter how expensive, I guarantee that fixing it twice because you moved too fast will cost more, both in dollars and your reputation, than moving a little slower and making sure you do everything you can the best way you know how.
It's also tempting to make one other tweak, or fix something else, or try to make your plan better on the fly. DON'T DO THAT. Just like you shouldn't make more than one change when testing or benchmarking, don't fall into temptation and do something extra. Fix that you need to fix and leave it at that. Compounding a problem is the worse thing you can do.
Worse than making the mistake in the first place.
Step by Step
- Keep Cool
- Assess the damage and implication
- Inform Someone
- Make a Plan
- Follow the plan