A Simple DR Solution

  • Comments posted to this topic are about the item A Simple DR Solution

  • yes, DR is an interesting exercise. below are some points from my own experience.

    o management do not understand time constraints for items such as bandwidth limitations, data volumes, restore times, etc. explain these things in their language and back them up with understandable metrics.

    o we were fortunate enough to have separate funding for this project. hence we were able to hire an independent consultant to prepare initial architectural recommendations. specifically, this addressed required bandwidth and the choice of replication technology, i.e. backup only [full/diff], log shipping, replication, mirroring.

    o engage a project manager, at least for the initial exercise.

    o document everything and keep it safe and accessible.

    o test your documented procedures using alternative staff, if possible, i.e. the person writing the failover guides should not be the person testing the failover capability.

    o time each task in your test sequence and communicate the results.

    o script everything and keep these scripts under version control. our log shipping is set up entirely from scripts. this is much faster than 'point and click' and can be easily automated or transferred to other DBA staff.

    o keep things up to date. new customers/applications come on board. don't assume that someone has added their databases to the DR site or updated the documentation. have an appropriate person in charge of this, preferably with a backup.

    i'm sure there are others. i'll add anything else that comes to mind.
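As a rough illustration of the point about bandwidth, data volumes and restore times, here is a minimal Python sketch that turns a data volume and a link speed into an elapsed-time figure management can relate to. The function name, the decimal-gigabyte convention and the default 70% link efficiency are assumptions made for this example, not figures from the thread.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to copy data_gb gigabytes over a link_mbps link.

    efficiency is the assumed fraction of the nominal bandwidth that is
    actually usable (protocol overhead, other traffic). It is a guess to be
    replaced with measured throughput.
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("need data_gb >= 0, link_mbps > 0 and 0 < efficiency <= 1")
    megabits = data_gb * 8 * 1000          # decimal GB -> megabits
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical numbers: a 100 GB full backup over a 10 Mbps link.
    for label, eff in (("nominal", 1.0), ("assumed 70% usable", 0.7)):
        print(f"{label}: {transfer_hours(100, 10, eff):.1f} hours")
```

Measured throughput from a real test copy is a better input than the nominal link speed, which is one more reason to time each task in the test sequence.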

  • One product that I have found to be very useful for this type of DR is Microsoft's Data Protection Manager. It does DB and file backups and is designed for low bandwidth scenarios.

    http://www.microsoft.com/systemcenter/dataprotectionmanager/en/us/default.aspx

  • Good article...:)

  • Nice work! I would like to have seen a bit more technical stuff though, as to how you're going to maintain the copies across the network, using replication, log shipping etc. Seems good for a prototype but what did you do next?

  • Just use Double-Take. If they say they can't afford it, tell them they can't afford not to. Consider the effort this guy put into his simple DR drill: the time spent could have paid for a license.

    http://www.doubletake.com/

  • we've been using EMC's SRDF for years now. we ship the data on a disk level from one SAN to another.

    problem now is we have x64 at the main site and the DR servers are so old that our laptops are more powerful. no one wants to write a check to buy servers that will sit there for months or years without any use.

  • Client from hell, huh?

    Do everything you're doing, and plan and implement a DR scenario during lunch.

    A lot depends on what the BO really meant by "DR". FTPing database backups to a remote site may be enough for some.

    Another approach would be to explain how much he CAN have in a day or a week. Sometimes the true cost will mitigate requirements. That gets SOME protection in fast, and then s/he can decide if the cost of more is worth it.

    Frankly, doing much more than FTPing the file systems with the database dumps is going to cost enough in labor that it'll be cheaper in the long run to license something like XOSoft.

    But if the BO's expectations are off by an order of magnitude, and he won't hear otherwise, it's probably time to fire the client.

    Roger L Reid

  • The article is much too simplistic in regard to changing all the places where the previous server name is hardcoded. That process is an absolute nightmare. We actually ran into a situation where we did a DR to a test server and changed the server name in our jobs everywhere we could think of (msdb, job steps, maint plans, etc.). Still, we found that the jobs were running on the original server. Microsoft said it's a reported bug but there's no solution yet. LOL (meaning lots of luck!)!
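One way to make the hunt for hardcoded server names less manual is to script the jobs and packages out to text and search that text. The Python sketch below assumes the scripts are already available as strings keyed by a name; the function name and the sample server names are hypothetical. It only finds textual references and does nothing about the reported bug described above.

```python
import re


def find_server_references(scripts: dict[str, str], old_server: str) -> list[tuple[str, int, str]]:
    """Return (script name, line number, line text) for every line naming old_server.

    The match is case-insensitive and must not be part of a longer name,
    so searching for PRODSQL01 does not report PRODSQL012.
    """
    pattern = re.compile(
        r"(?<![\w-])" + re.escape(old_server) + r"(?![\w-])",
        re.IGNORECASE,
    )
    hits = []
    for name, text in scripts.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((name, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical scripted-out job steps.
    sample = {
        "nightly_backup.sql": "BACKUP DATABASE Sales TO DISK = N'\\\\PRODSQL01\\backups\\sales.bak'",
        "reindex.sql": "-- runs locally, no server name here",
    }
    for name, lineno, line in find_server_references(sample, "prodsql01"):
        print(f"{name}:{lineno}: {line}")
```

Running the same search again after the rename gives a quick check that nothing textual was missed.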

  • Nice article. However, I had to come up with something like this as well and chose mirroring because the client wanted to be able to switch the app as part of DR and not have to shut everything down - and did not have the budget to cluster.

    Another client asked for P2P replication as part of their DR. The problem is, they want to use it as a fail-safe as well as a way to load balance the app. It's up and running though (on some very high-end servers 🙂 )

  • Very good article for a basic intro to DR and some of the pitfalls that you could encounter. You're describing a higher level of DR than we're employing at this time, though we're working towards a higher level. Unfortunately right now all we have is our backups going to a building a mile away. If we lost the server room to a fire or whatever, we wouldn't have much data loss, though there'd be a LOT of purchase orders issued the next day for new hardware!

    Fortunately our ERP system SFTP's itself to the vendor on the east coast every night, so that system can still be productive if a disaster happens. They were able to cut paychecks for a city hall that was wiped off the map by Katrina and couriered them to the nearest city with an open bank. It may not be perfect DR, but it's good enough.

    -----
    Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson

  • for our apps we use db's and tables that are populated with usernames, server, db, login and password

    app starts up and checks the db and table for the credentials it will use to access sql. this way all we have to do is make a few changes to point apps to another server
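The lookup-table approach described above could look roughly like the following Python sketch. It uses an in-memory SQLite database purely as a stand-in for the real configuration database, and the table name `app_connections`, its columns and the function names are assumptions for illustration. Passwords are stored in plain text here only to keep the example short.

```python
import sqlite3


def create_config_table(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that maps each app to its connection details."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS app_connections ("
        "app_name TEXT PRIMARY KEY, server TEXT, db TEXT, login TEXT, password TEXT)"
    )


def lookup_connection(conn: sqlite3.Connection, app_name: str) -> dict:
    """What an app would do at startup: fetch the connection details it should use."""
    row = conn.execute(
        "SELECT server, db, login, password FROM app_connections WHERE app_name = ?",
        (app_name,),
    ).fetchone()
    if row is None:
        raise KeyError(f"no connection details registered for {app_name!r}")
    return dict(zip(("server", "db", "login", "password"), row))


def repoint_apps(conn: sqlite3.Connection, old_server: str, new_server: str) -> int:
    """Point every app that used old_server at new_server; return how many rows changed."""
    cur = conn.execute(
        "UPDATE app_connections SET server = ? WHERE server = ?",
        (new_server, old_server),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_config_table(conn)
    conn.execute(
        "INSERT INTO app_connections VALUES ('payroll', 'PRODSQL01', 'Payroll', 'svc_pay', 'secret')"
    )
    print(repoint_apps(conn, "PRODSQL01", "DRSQL01"), "app(s) repointed")
    print(lookup_connection(conn, "payroll")["server"])
```

With this layout a failover becomes a single update statement, provided the configuration database itself is reachable from the DR site.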

  • Thanks for a clear and concise article. It made me realise that there is some work to do here! The backups are good but the jobs, DTS packages and logins aren't very well covered.

    Nicole Bowman

    Nothing is forever.

  • Nicole Bowman (7/31/2008)


    Thanks for a clear and concise article. It made me realise that there is some work to do here! The backups are good but the jobs, DTS packages and logins aren't very well covered.

    Jobs and DTS packages are stored in msdb: you are backing up your system databases, aren't you? 😀 Under SQL Server 2000, you can also script out your jobs, which is very handy for copying standardized maintenance (DBCCs, system DB backups) between servers. You can script them out in 2005 too, but in 2000 you can do them all at once. Or at least if you can do them all at once in 2005, I haven't found it yet.

    As a part of my EOM/first of the month processing, I script out all databases, and then in separate runs, I script out the logins and jobs. If an object accidentally gets deleted, it gives me a fallback. If we had a CVS system, I'd be checking it in there so I could run deltas between months and see if anyone is mucking with my systems that I don't know about.
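The month-to-month delta mentioned above does not strictly need a version control system. As a minimal sketch, Python's standard `difflib` module can compare two scripted-out snapshots held as text; the function name and the `last_month/` and `this_month/` labels are assumptions for illustration.

```python
import difflib


def script_delta(previous: str, current: str, name: str = "object.sql") -> list[str]:
    """Return a unified diff between two scripted-out versions of the same object.

    An empty list means the script did not change between the two snapshots.
    """
    return list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            fromfile=f"last_month/{name}",
            tofile=f"this_month/{name}",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical job step scripted out on two consecutive months.
    june = "EXEC dbo.usp_nightly_load\nEXEC dbo.usp_purge @days = 90"
    july = "EXEC dbo.usp_nightly_load\nEXEC dbo.usp_purge @days = 7"
    print("\n".join(script_delta(june, july, "job_nightly.sql")))
```

Checking the scripts into version control gives the same comparison plus history, so this is only a stopgap for sites without one.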


  • I have worked on large and small DR projects for SQL Server. The very best/coolest used SRDF/A on Symmetrix with a Windows 2003 cluster; the smallest used transactional replication. SRDF technology had my disks waiting for me at the DR data center 1200 miles away over DS3. All I needed to do was script the unlocking of the disks, run two batch files and bring up the cluster with cluadmin, and we were live in 5 minutes 🙂

    No matter what type though, there is one thing for certain: the business generally has a very good clue about what they want but they can't describe it, and my job has been to edumicate them. If I don't get them up to speed, who will? Even some server admins who work on advanced projects and are certainly not ignorant dudes, miss the boat on SQL Server DR. It's funny to joke around about, but it won't be when "it" happens. And I've seen "it" at least once with a data center fire that cost a few million but didn't kill the company (25B, large undisclosed company).

    One of the biggest misconceptions concerns the living state of the mdf/log files and the fact that flat file backups of the NT system don't cut it for recovery. Many folks believe they are covered because they back up the server OS, including all the data volumes and log volumes that SQL Server lives on. Not the case. While flat file backups or .BK files that are caught in the NT backup will recover the server to a point in time, that may not be sufficient for what the business needs. Worse, if the NT backup runs before the SQL backup has been written, you may be a day late (NT at 8PM, SQL at 10PM, etc.).
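The "day late" effect is easy to put a number on. The sketch below assumes both jobs run once a day at fixed times and ignores how long each backup takes; the function name is hypothetical. It reports how old the newest SQL backup file already is at the moment the file-level backup picks it up.

```python
from datetime import date, datetime, time, timedelta


def captured_backup_age(file_backup_at: time, sql_backup_at: time) -> timedelta:
    """Age of the newest SQL backup file when the daily file-level backup runs.

    Assumes both jobs run once per day at fixed times and finish instantly.
    """
    day = date(2008, 7, 31)  # arbitrary day, only the times of day matter
    file_run = datetime.combine(day, file_backup_at)
    sql_run = datetime.combine(day, sql_backup_at)
    if sql_run > file_run:
        # Today's SQL backup does not exist yet, so yesterday's gets captured.
        sql_run -= timedelta(days=1)
    return file_run - sql_run


if __name__ == "__main__":
    # The schedule from the post: file-level backup at 8PM, SQL backup at 10PM.
    print(captured_backup_age(time(20, 0), time(22, 0)))  # 22:00:00
    # Swapping the order so the file-level backup follows the SQL backup.
    print(captured_backup_age(time(23, 0), time(22, 0)))  # 1:00:00
```

With the 8PM/10PM schedule the copy that goes offsite is already 22 hours old when it is taken, and nearly two days old just before the next file-level run.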

    My approach has been to tackle it at a high level as such:

    1. Determine the Business Recovery Requirement

    2. Explain Existing Recovery Capabilities to biz

    3. Explain Existing Risk to biz

    4. Match Requirement to Existing Recovery Capability

    5. Have biz sign-off on the documented risk level they are willing to accept

    6. Implement a disaster recovery that will CYA beyond the minimum the business signed-off on

    7. Pray that you aren't the guy who tells the bad story of his DR setup failing and who is looking for a job in this forum.
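Steps 1 to 5 above amount to comparing what the business requires with what the current setup can deliver and writing down the difference. A minimal Python sketch of that comparison follows; the class, its two fields (worst-case data loss and downtime, in minutes) and the sample numbers are assumptions for illustration, not part of the original list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryTarget:
    """Worst-case data loss and downtime, both in minutes."""
    max_data_loss_minutes: float
    max_downtime_minutes: float


def find_gaps(required: RecoveryTarget, existing: RecoveryTarget) -> list[str]:
    """List every way the existing capability falls short of the requirement.

    An empty list means the existing setup meets the requirement; anything
    else is the risk the business would be asked to sign off on.
    """
    gaps = []
    if existing.max_data_loss_minutes > required.max_data_loss_minutes:
        gaps.append(
            f"data loss: up to {existing.max_data_loss_minutes:g} min possible, "
            f"business accepts {required.max_data_loss_minutes:g} min"
        )
    if existing.max_downtime_minutes > required.max_downtime_minutes:
        gaps.append(
            f"downtime: up to {existing.max_downtime_minutes:g} min possible, "
            f"business accepts {required.max_downtime_minutes:g} min"
        )
    return gaps


if __name__ == "__main__":
    # Hypothetical figures: the business wants 15 min / 1 hour,
    # nightly backups deliver roughly 24 hours / 4 hours.
    required = RecoveryTarget(max_data_loss_minutes=15, max_downtime_minutes=60)
    nightly_backups = RecoveryTarget(max_data_loss_minutes=1440, max_downtime_minutes=240)
    for gap in find_gaps(required, nightly_backups):
        print(gap)
```

The printed gaps are the kind of plain statement that can go straight into the sign-off document in step 5.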

    And lastly, remember to be the guy bugging your boss about disk, DR, backups and your needs. Gripe and fuss, and do it in e-mail, on the phone and through chat. That way, when you save the day by the hair of your chinny-chin-chin, you can tell them that hiring you, and keeping you employed during an economic slowdown, saved them all the money they would otherwise have had to spend on products that actually do guarantee business continuity, had they not hired a DBA who implemented CYA.

    -Cal
