From the DBA’s perspective, a Disaster Recovery (DR) solution can be daunting. Many pieces have to fit together for a DR solution to succeed. It’s important to be involved from the beginning of the project, and to help understand and define the scope.
This is my account of a DR project that was handed to me one afternoon as an afterthought. I was tasked with getting the main production instance of each of our 15+ clients up on a DR site within a few weeks. The first client showed up at my workspace and wanted to test the DR solution that afternoon; I panicked. No work had been done yet. Emails had been sent out asking for a DR solution, and that was all at that point.
Feeling backed into a corner, I started spouting database jargon and a list of things that needed to be gathered before a successful test. I said that we needed to get the Business Owner a list of the DTS packages, Jobs and Users, so he could tell us which ones had to exist for this DR database. We obviously had to copy the database down to the DR server, since no replication or other HA solution was in place. So I hurriedly created spreadsheets for all the DTS packages, Jobs, and Users, and forwarded them to the Business Owners with the plea that they let me know which of each were imperative for this solution, from their perspective. I placed myself at their disposal as a Subject Matter Expert to assist in any of the discovery necessary. I started the process of copying the backup down to our DR server in Mexico, through a small-bandwidth pipe that was already overloaded. The first database for the first client in the queue was on its way to a fully functional DR site. I was so excited. While I was waiting for the database backup to finish copying, the Business Owner stopped by and asked if we could test the site now. I laughed, explained, and continued working.
Once the database was copied down and restored, I went back to the Business Owner and asked for the answers to the DTS, Job and User questions I had sent earlier. He had not looked at the email, and had no idea what I was requesting. After a bit more discussion, it became clear that this was a first attempt at a DR site, so a very simple implementation of DR would do: a basic copy of the database, a couple of DTS packages, some users, and a copy of the website, with some images of documents. We were to plug this website into the database, log in, pull up one document, see the image, and call it good. A single IP address would be used as we rolled each client into the DR site, and the websites would not be functional beyond the initial test.
Even with this simple version of a DR site, I learned a few things. I will enumerate them here, in the hope that they will help others who attempt even the simplest DR implementation.
- Copy backups of databases to DR server
  - Ensure that valid backups are actually being taken on the Production server.
  - Determine the time frame when it is acceptable to copy files down to the DR server.
  - Ensure that there is enough bandwidth to copy the backups.
  - Know how the backups are performed: 3rd party applications vs. standard backup. Make sure that you have the same process on the DR side.
- Configuration/Installation of DR System
  - SQL Server needs to be installed on DR site
  - Configuration of SQL Server on DR site
  - Permissions granted on DR site
- Information gathering
  - Run scripts to gather users
  - Run scripts to gather jobs and info about jobs
  - Run script to gather DTS information
  - Present this information to business owners in an understandable way
  - Review information and decide on subset of items to implement
- Implement the information gathered in DR site
  - Restore the database
  - Add the users, and/or reattach the orphaned users
  - Create any needed jobs
  - Copy any DTS packages needed
  - Alter any configurations inside the DB that are hardcoded to a server. (Don’t laugh. Make sure you look for them.)
- Testing system
  - Be ready and able to dig into the system for configurations and hard-coded issues, and be nimble enough to alter them to get it working
  - Be available when testing occurs; make this a priority
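The bandwidth and time-frame items in the list above are the ones that bit me first, and they can be checked with simple arithmetic before anyone promises a test date. Below is a minimal sketch in Python, not a tool we actually used. The function names, the 50% usable-bandwidth default, and the sample numbers are all assumptions for illustration; measure your own link and backup sizes.

```python
def estimate_copy_hours(backup_size_gb, link_mbps, usable_fraction=0.5):
    """Rough hours needed to copy one backup file over a shared link.

    backup_size_gb  -- backup size in decimal gigabytes (assumed input)
    link_mbps       -- nominal link speed in megabits per second
    usable_fraction -- share of the link you can realistically use;
                       0.5 is a guess for an already busy pipe
    """
    if backup_size_gb < 0 or link_mbps <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("sizes and speeds must be positive, fraction in (0, 1]")
    size_megabits = backup_size_gb * 1000 * 8      # GB -> megabits
    effective_mbps = link_mbps * usable_fraction   # what is left for the copy
    return size_megabits / effective_mbps / 3600   # seconds -> hours


def fits_window(backup_size_gb, link_mbps, window_hours, usable_fraction=0.5):
    """True if the copy is expected to finish inside the agreed window."""
    hours = estimate_copy_hours(backup_size_gb, link_mbps, usable_fraction)
    return hours <= window_hours


if __name__ == "__main__":
    # Hypothetical numbers: a 45 GB backup over a 10 Mbps link, half of it usable.
    hours = estimate_copy_hours(45, 10)
    print(f"Estimated copy time: {hours:.1f} hours")
    print("Fits an 8-hour overnight window:", fits_window(45, 10, 8))
```

An estimate like this ignores compression, retries and protocol overhead, so treat it as a lower bound. Its value is in the conversation it enables: if one copy cannot fit the agreed window, a same-afternoon test is off the table.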
With help from the DBA, Network Admins, Business Analysts, Production Support, Business Owners and others, most of the hurdles you encounter can be overcome well enough for the system to be tested as a DR instance. You may want to emphasize, up front, the importance of the project and the time needed to get each client up and running. We ran into problems where the Business Owner expected a quicker response than we were prepared to provide. This friction did not go over well, and caused more issues. Other clients were more laid back, and tolerated deviations and delays. We initially didn’t treat this project as a priority, and felt overwhelmed by our day jobs along with this added task. When we reprioritized the project, things became much smoother. I wish these concerns had been brought to the group at the beginning and resolved up front.
When you attempt to create your first DR site, make sure you plan well. Get buy-in on the list of tasks and expected results. Involve every party whose participation is needed to succeed. Plan, plan and plan. Leave room for errors and the time to resolve them. I hope that your first, second or whatever attempt at a DR site is more successful than the last one.