Mondays had been good to me recently. Mondays at work are usually quiet, and occasionally I can even sneak in a nap if I want to. Yeah, I have it good. But this Monday would prove to be a bit trickier than normal.
It started out normal enough: I was rushing to work and hoping for one more day full of naps. As I settled in and logged on to my system, I saw an email from one of the clients. It said he was not able to connect to his SQL Server 2008 instance. He had also tried using the SQL Server 2008 Configuration Manager tool to check the SQL services and received this error:
"Cannot Connect to WMI Provider. You do not have permission or the server is unreachable".
Note: 'WMI' (Windows Management Instrumentation) is the Windows infrastructure used for managing and monitoring server and database components.
A couple of tickets were assigned to me with 'urgent' priority, asking me to help resolve the issue as soon as possible.
My dreams were already lying in a gutter, and any sweet nap would have to wait, but this is what DBAs are made for. I put my feet in detective shoes and started to investigate. I logged on to the box and opened SQL Server Management Studio, attempting to connect to the database engine, but immediately received a familiar error:
"A network-related or instance-specific error occurred while establishing a connection to SQL Server.
The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections."
I tried to use the SQL Server Configuration Manager tool, and was greeted with the same error described by the customer. I launched the Windows Event Viewer, began investigating the system and application logs, and was quickly rewarded. Sort of.
The Master database had become corrupted. That was why we could not connect to our SQL Server 2008 instance. By now, any thoughts of a quiet day were gone, and a flood of emails from the client started pouring in. I rubbed my hands together, cracked my knuckles, and got straight to work.
The first thing to fix was the WMI error.
I launched the Windows Services tool (services.msc), clicked on the dependencies tab of WMI (Windows Management Instrumentation service), and took note of the problem. It showed an "Initialization failure".
As this tool falls under our 'Windows Team' category, I contacted an available resource, and they were quickly able to resolve the issue. However, even with the WMI service working properly, I was still unable to connect to, or use anything related to, SQL Server.
Whenever we manage WMI, or systems that use WMI, we use MOF files. WMI data (such as definitions of namespaces, classes, instances, or providers) is represented in MOF files. As our SQL Server 2008 was not able to connect to the WMI provider, the next step after fixing WMI was to fix the SQL Server 2008 MOF file "sqlmgmproviderxpsp2up.mof" so that the SQL Server 2008 Configuration Manager would work again. This file (in SQL Server 2008) is generally found in the 'C:\Program Files\Microsoft SQL Server\100\Shared\' folder.
Occasionally, during setup, some MOF files don't get installed and/or registered correctly. There is a program called "mofcomp" that compiles MOF files and stores the data associated with them in the WMI repository. If the MOF file information becomes damaged or compromised, or was never installed correctly, the result is an error message like the one I saw. I ran the statement below in a Windows command prompt to re-register the "sqlmgmproviderxpsp2up.mof" file in the WMI repository:
mofcomp "C:\Program Files\Microsoft SQL Server\100\Shared\sqlmgmproviderxpsp2up.mof"
This did the trick. I was now able to use the SQL Server 2008 Configuration Manager tool. I quickly updated the client, and he took a few deep breaths (So did I). But this was not the end of the story.
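For anyone who wants to script this step, here is a minimal sketch in Python that only assembles the mofcomp command shown above. The function name and its parameters are my own for illustration, and the default folder "100" and root path are simply the SQL Server 2008 values from this post; check the path on your own server before running anything.

```python
import ntpath  # Windows path rules, even if this is run on another OS

# File name taken from the post; the folder layout below is an assumption
# based on the SQL Server 2008 default path mentioned above.
MOF_FILE = "sqlmgmproviderxpsp2up.mof"


def mofcomp_command(version_folder="100",
                    root=r"C:\Program Files\Microsoft SQL Server"):
    """Return the argument list for recompiling the SQL Server MOF file."""
    mof_path = ntpath.join(root, version_folder, "Shared", MOF_FILE)
    return ["mofcomp", mof_path]
```

On the affected server, the returned list could be handed to subprocess.run from an elevated prompt; keeping it as a list avoids quoting problems with the space in "Program Files".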
I was now able to use SQL Server 2008 Configuration Manager, but it was showing all the SQL services were currently stopped. As expected, the services would not start due to the corruption in the master database. So our next step was to fix corruption in the master database.
I knew how to fix it in SQL Server 2005. We generally run the statement shown below, using the "REBUILDDATABASE" option in setup.exe and the "/qn" parameter for installation to run silently:
start /wait <CD or DVD Drive>\setup.exe /qn
INSTANCENAME=<InstanceName> REINSTALL=SQL_Engine
REBUILDDATABASE=1 SAPWD=<NewStrongPassword>
I usually like to keep a few e-books handy as a reference in case I need some help, especially with newer versions of SQL Server. In this case, I did just that, referencing the SQL Server 2008 e-book I had available (Microsoft SQL Server 2008 R2 Unleashed by Sams publications).
That book explained that the statement to fix a corrupted Master database is the same in both versions (2005 and 2008). I tried the statement (noted above), but found that it didn't work. It drove me crazy to find posts all over the web saying the same thing: the fix is similar to SQL Server 2005. The client's temperature was rising, and my hands were getting cold, as this was a real-time production issue. I had no choice but to keep searching for a solution, and I kept doing so until I finally found an article (link shown below) from Brian Egler: http://www.networkworld.com/community/node/39292
Brian's article made me hopeful, as he had a different solution from the one I had been chasing. Taking Brian's advice, I went ahead and ran the statement shown below (in the context of my server) to recover the corrupted Master database:
setup.exe
/QUIET
/ACTION=REBUILDDATABASE
/INSTANCENAME=instance_name
/SQLSYSADMINACCOUNTS=accounts
[/SAPWD=password]
[/SQLCOLLATION=collation_name]
This command worked wonders! I think the database may have actually smiled back at me!
I was finally able to start the SQL services and connect to the SQL instance. I gave a celebratory fist-pump, and continued on.
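The rebuild command has two optional switches (the ones in square brackets), which makes it easy to mistype under pressure. As a rough sketch, the Python helper below builds the argument list from the template Brian describes. The function name and parameter names are hypothetical, and the sketch assumes the accounts value is passed as a single string exactly as in the template.

```python
def build_rebuild_command(instance_name, sysadmin_accounts,
                          sa_password=None, collation=None):
    """Assemble setup.exe arguments for /ACTION=REBUILDDATABASE."""
    if not instance_name:
        raise ValueError("instance_name is required")
    if not sysadmin_accounts:
        raise ValueError("sysadmin_accounts is required")

    args = [
        "setup.exe",
        "/QUIET",
        "/ACTION=REBUILDDATABASE",
        f"/INSTANCENAME={instance_name}",
        f"/SQLSYSADMINACCOUNTS={sysadmin_accounts}",
    ]
    # The bracketed switches in the template are optional,
    # so they are only added when a value is supplied.
    if sa_password is not None:
        args.append(f"/SAPWD={sa_password}")
    if collation is not None:
        args.append(f"/SQLCOLLATION={collation}")
    return args
```

For a default instance, the instance name would be MSSQLSERVER; the account passed in is whatever Windows login should end up as sysadmin after the rebuild.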
We were lucky to have the backups of system and user databases available; otherwise we would have lost the data. So the next step was to restore the master database backup. First, I started the SQL instance in single-user mode using the following command:
sqlservr.exe -m -s <instancename>
I restored the latest backup of my master and user databases, bringing the instance to the latest point possible. We were officially golden.
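The restore of master itself is a single T-SQL statement run against the instance while it is in single-user mode. The sketch below only builds that statement as text; the helper name and the backup path in the usage note are hypothetical, and it assumes a disk backup file rather than a tape or other device.

```python
def build_restore_master_sql(backup_path):
    """Return the T-SQL text for restoring master from a disk backup."""
    # Double any single quotes so the path stays a valid T-SQL string literal.
    escaped = backup_path.replace("'", "''")
    return f"RESTORE DATABASE master FROM DISK = N'{escaped}' WITH REPLACE;"
```

Calling it with something like r"D:\Backups\master.bak" gives a statement that can be pasted into sqlcmd. Expect the instance to shut down once master is restored; it then needs a normal restart before the user databases are restored.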
Now, I don't know who Brian Egler is, but the last thing that came to my mind after all this was "I owe this guy some cool gifts for saving my day".
Thank you, Brian Egler.
So to summarize, I performed the following steps to facilitate the connection to SQL Server 2008 and restore the instance:
- Fixed the WMI provider error with the help of the Windows team.
- Used the mofcomp utility to re-register the sqlmgmproviderxpsp2up.mof file in the WMI repository.
- Fixed the corrupted Master database using the new SQL Server 2008 command (provided graciously by Brian).
- Restored the latest backup of the master database.
- Restored the latest backups of all user databases.
That was it.
The client's production was back in business, and we were both happy with the outcome.
And to top it all off, I still had time for a nap.