Bringing them back from the dead

, 2018-09-10

Diagnosing “dead” SQL Server instances

Several times in the past, I’ve had to deal with massive outages where tens or even hundreds of SQL Server instances have gone down and only a portion of those came back online. When trying to find the causes for why the SQL Server service doesn’t start, one of the best places to start is the SQL Server Error log.

So, what is the problem?

There is a drawback when the SQL Server Service is not running: the normal methods of displaying and analysing the Error Logs are not available. The option then is to use a manual method that means that one must locate the Error Log directory, open all the log files in a text editor, and go through each line. That can be quite a task for a single instance, let alone having to do this for many servers. To mitigate this, I have written a simple batch file and compiled a collection of search pattern files. The batch file searches for the error log directory and then parses each file, searching for error entries for each severity level by matching the search pattern files.

Having done this work and knowing that others might face the same problem, I decided to share my solution with this quick write-up.

Compatibility

An SQL Server is like a box of chocolates; you’ll never know what version or architecture you’re going to get. Hence, for this task, the best option is to use DOS/CMD commands as that does not require any additional tools or libraries being installed on the server.

Of those commands, the most helpful for this purpose are DIR, FINDSTR, and the control flow, IF. The FINDSTR is particularly useful, as one can compile the search patterns in a file and then pass the file name as a parameter.

FINDSTR has one drawback, which is that if the file being search is a Unicode file, then no match will be found unless one appends every character with the period, “.”, character in the search pattern file. Depending on the SQL Server version, the Error Log files can either be Unicode or ASCII text files, therefore in the effort of creating one-size-fits-all, two pattern files are used for each error level, one in ASCII format, here the file name of SEVERITY_LEVEL[NN].txt being prefixed with WINNT and the Unicode files simply called SEVERITY_LEVEL[NN].txt.

How it works

When invoked with the drive letter of the drive to search, the batch file simply recursively iterates through all directories and sub-directories of the drive until it locates the ERRORLOG directory, then it does a pattern matching on all the log files in that directory, very simple. Since this is not done in SQL, I’m not going to go any further into the actual code.

Usage

The prime requirement is to have an Administrative access to the system being diagnosed.

The first step is to upload/copy the zip archive and expand it on the target system. Since there are some older systems which do not have a command line unzip programmes installed, I normally do this from either a USB key with the directory uncompressed or from a network share.

The next step is to open a command prompt, navigate to the directory containing the TSSQLEX.bat file and then run it with the with the desired parameter / switches. To get a list of applicable switches, run the batch file without any parameters or switches.

A nifty little thing in the batch file is the /O switch, which writes the output to a file in the batch file directory. This means that one does not have to scrape the command prompt output to collect the results.

Syntax

TSSQLELX.bat [Drive letter] [/S] [/P] [/V] [/H] [/O]

Parameters:

The first parameter must either be a single character drive letter of the drive to search or the /H switch, if no arguments are passed, the syntax is printed.

Switches:

S Skip the functionality tests

The functionality test asserts the search functionality by looking for a pattern that appears in all error log files.

P Print the individual commands before executing

Useful for repeating searches for individual error level by copying the command and re-run it from the command prompt.

V Verbose, print additional information for each error level.

Prints the error level description text from BOL.

H prints this full help message.

The message can be found in the README.txt file in the batch file's directory.

O Writes the output to a file called COMPUTERNAME.USERDNSDOMAIN_ TSSQLEX.txt.

The file is created or overwritten in the batch file's directory.

Resources

Rate

5 (3)

Share

Share

Rate

5 (3)

Related content

Worst Practice - Detailed Disaster Plans

Another in our series of things you should avoid at all costs. Seems silly at first, not having a detailed recovery plan for your SQL Server. Perhaps it is, but having dealt with any number of problems over the last few years, Steve Jones has some ideas why a detailed plan may not be the best thing to spend your time on.

2004-11-09

7,917 reads

Backup Scenarios for successful SQL Server Restores and Recovery

SQL Server has a great backup and recovery architecture, but you have to know how to properly configure and use the server to ensure that you will not be seeking new employment anytime soon. A few of the Sonasoft team have written this short piece on strategies for setting up your backup jobs to ensure recovery in the event of a disaster. Welcome new authors Bilal Ahmed, Kiran Kumar, and Vas Srinivasan.

4 (2)

2004-06-29

17,574 reads

Save Yourself - Recovering from an XP Disaster

How many of you dig into Windows XP extensively? Working with hardware and the OS isn't something that many DBAs deal with these days. Most companies have an admin to work on servers, hardware, workstations, etc. But sometimes you need to help yourself out. Steve Jones had to work to get his laptop back after an XP disaster. Read on and hopefully this will help you one day recover your system.

2004-06-23

5,868 reads

Mini Disaster - AC Failure

Are you prepared to handle a full or partial failure of your AC system? Ever thought about what would happen if it did happen. Once again Andy offers comments on a real world incident. While we'd all like to think it will never happen to us, we think sharing these incidents is a great way to prevent it from happening to others.

2004-06-07

6,109 reads

Streamlining the Database Server Recovery Process

Are you tired of manually restoring each database on a new server when the original server has a melt down? Does the manual process seem slow, and prone to keystoke and mouse click errors? Would you like to have those restore scripts automatically built, so you only have to fire them off? Well this article will show you one possible method for speeding up and reducing errors will trying to perform a restore of all databases on a server.

5 (2)

2002-11-05

8,972 reads