Today we have a guest editorial as Steve is away on holiday.
"By seeking and blundering we learn." – Goethe
We all make the occasional mistake, although hopefully not as public or humiliating as ordering $20bn worth of trains that were too wide for many regional platforms, or the recent security blunder that led to the passport data of the England soccer team being posted on social media.
So, while I agree with Goethe wholeheartedly, as a DBA and guardian of an organization's data, I feel compelled to add "But try to do it on a VM Development Box and not in Production".
Some DBAs, especially early in their careers, adopt a rather gung-ho attitude to many of their tasks. Urgent job? Fire up SSMS and do it! Needless to say, this leads to mistakes, often completely breaking whatever faltering code they were attempting to "fix". No one is immune from the temptation to cut corners and save time, especially when under pressure. I remember, early in my career, making a few mistakes, such as the time I accidentally deleted an important database backup. Fortunately, no harm came of that one, but I recall to this day the heart-stopping moments between hitting that delete/enter key and being able to verify that I had another copy of the backup.
What's important is how we deal with these mistakes and what we learn from them. According to Andy McNab, in his book Bravo Two Zero, "Check & Test" is a far more realistic motto for the SAS than "Who Dares Wins". I learned quickly that this is equally true for DBAs.
I've adopted the "Check & Test" motto, and it is a procedure I implement every day, for every task, and with every client. These days, I use the "Create script" task in SSMS constantly. If I step through a task in the UI, I examine the underlying script, make sure I understand its logic exactly, and test that it does exactly what I expect and exactly what is required. Only then do I deploy it. Check and test.
Recently, a line manager asked me to archive parts of a 300GB database. He was sitting next to me and I was eager to help, so I walked through the archive process and almost started it immediately. However, past experience stayed my finger on the start button, and instead I quizzed the manager about recent backups. It turned out there hadn't been one for 30 days, as it was "too large a backup". Again, check and test.
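That "check" is easy to turn into a habit, or even a small pre-flight script. What follows is only a minimal sketch, not a prescribed tool: the function names and the 24-hour threshold are my own assumptions for illustration, and in practice the last-backup timestamp would come from your server's backup history (on SQL Server, for example, the backup_finish_date column of msdb.dbo.backupset).

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed threshold for illustration only; pick one that matches your recovery objectives.
DEFAULT_MAX_AGE = timedelta(hours=24)


def backup_is_recent(
    last_backup: Optional[datetime],
    now: datetime,
    max_age: timedelta = DEFAULT_MAX_AGE,
) -> bool:
    """Return True only if a backup exists and is no older than max_age."""
    if last_backup is None:
        return False  # no backup on record at all
    if last_backup > now:
        return False  # a timestamp in the future is suspicious; treat it as a failed check
    return now - last_backup <= max_age


def preflight_message(
    last_backup: Optional[datetime],
    now: datetime,
    max_age: timedelta = DEFAULT_MAX_AGE,
) -> str:
    """Describe whether it looks safe to start a destructive task such as an archive."""
    if backup_is_recent(last_backup, now, max_age):
        return "OK: recent backup found, proceed with the archive."
    return "STOP: no sufficiently recent backup, take (and verify) one first."
```

Run against the situation above, where the last backup was 30 days old, `preflight_message` returns the "STOP" message. The check does not prove the backup is restorable, so it complements a test restore rather than replacing it.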
I've recently acquired a small addiction to playing chess. It's a game that punishes mistakes severely and therefore makes one constantly examine a proposed action before doing it (Am I missing something? What are the consequences of this action? What's my fallback plan?). Likewise, it's no bad thing to question yourself or ask a colleague to peer review your script or procedure, and thankfully it's a "check" that I see more and more of my fellow developers and DBAs use. Also, when applying changes to any system (not just Production), always make sure you know your fallback plan in case something goes awry.
Of course, a lot of what I have described is common sense, but it is surprising how many teams don't seem to implement even such simple procedures. It isn't possible to eliminate mistakes entirely, especially when working to tight deadlines and under pressure. However, with "Check & Test", you can minimize the potential for mistakes and be much better prepared to deal with those that slip through.
Scott Crosby (Guest Editor)