When I heard about Allen Kinsel’s (aka @sqlinsaneo on Twitter) Invitation to T-SQL Tuesday #19 – Disasters & Recovery, for some reason I read it as “Invitation to Disaster” LOL. Well, if you DON’T have a comprehensive DR plan for your IT and server infrastructure, then indeed you have an invitation to disaster!
When I saw the Doppler image of a hurricane, I immediately thought of my own experience with disaster, that fateful day we all know as “9/11”. I was all too close to the “action” that morning, and I am not going to relive that here today. (I blogged my own personal story on one anniversary of that date: September 11 - Never Forget - My story of the tragic day.) Interestingly, in light of today’s topic, that event also occurred on a Tuesday.
However, in the context of this T-SQL Tuesday, one can argue that 9/11 changed a lot of things!
So, we had Disaster. What was the Recovery?
When I was working downtown then, the company had a second office location in midtown, but it was far from what one would consider a COLO. In fact, the app/db servers in the downtown location had nothing to do with the app/db servers hosted in the midtown facility. So, the recovery was a disaster in itself! It was a hectic several weeks following 9/11. We literally spent 24/7 racking, stacking and provisioning hardware and servers, restoring images where possible, then re-installing all the SQL Servers, and finally restoring all the databases we had from boxes of tape. Even then, I was thankful for SQL Server 7.0, but there were a couple of 6.5 instances out there (which required prior knowledge of the database sizes for each data and log device, ugh!). It took many weeks to fully recover, and the business certainly lost money in the meantime.
Well, we certainly have come a looong way. Unfortunately, it took a disaster of this magnitude to get businesses thinking about, well, business continuity. In the post-9/11 world, with respect to technology, many had to reflect upon Lessons Learned from 9/11, as discussed in the highlighted article.
The recovery at my then-employer was followed by a massive investment in a strategic disaster recovery/high availability/business continuity/colocation plan.
They decided to take their infrastructure out of New York completely (maybe overkill) and place it in two geographically dispersed data centers as their COLO (this is smart).
They ended up going with a geo-clustering architecture using a third-party solution. This was before the days of virtualization, which I see as one excellent strategy that should be included in every company’s consideration for DR. Virtualization provides many advantages for managing database/server infrastructure, among them allowing you to easily move from one server’s hardware to another, even if that move is from one physical location to another! When I was part of the DR planning committee at another company, they weren’t quite convinced that virtualization was right for high-transaction production database servers, but for a DR scenario it was definitely a win-win. We therefore implemented a P2V (physical-to-virtual) deployment, where the physical production servers were housed at headquarters in NYC, and the colo was at a datacenter across the river in New Jersey, where the systems were virtualized across blade servers.
In addition to P2V, for our SQL Server 2008 Cluster, I also implemented database mirroring across an MPLS pipe. This provided both protection against hardware failure and high availability in the event of a data-center disaster. High availability is a very important part of overall DR and business continuity, because it minimizes the impact of a disaster. High availability is all about the site being accessible all the time, 24x7, and HA solutions minimize the downtime for these mission-critical applications.
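To put “accessible all the time” into numbers, here is a rough sketch in Python (my choice for illustration only; it was not part of the setup described above) that turns an availability target into the downtime it allows per year. The function name and the sample targets are hypothetical, and the year is simplified to 365 days.

```python
# Illustrative sketch: convert an availability target into a yearly downtime budget.
# Assumption: a 365-day year (8,760 hours), ignoring leap years.
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(availability_pct: float) -> float:
    """Return the hours per year a system may be down and still meet the target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Hypothetical targets, often quoted as "two nines", "three nines", "four nines".
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% uptime allows about {hours:.2f} hours of downtime per year")
```

A 99.9% target leaves under nine hours of downtime a year, which is a very different budget from the weeks of recovery I described earlier.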
If you are interested in exploring very real solutions for high availability, there is a very good book that recently came out by SQL MVP Hemantgiri S. Goswami, called "Microsoft SQL Server 2008 High Availability". It discusses Clustering, Mirroring, Log Shipping, Replication and more.
Finally, there are all sorts of disasters that can occur. To the end-client, if the systems are down, they are down. Steve Jones blogged recently on What’s A Disaster, where he lists the various things that can bring your database servers to a halt, from a hard-drive crash to a hurricane. A disaster does not necessarily need to be on the scale of a 9/11 or even a natural weather event like a hurricane. You just need to plan ahead and be able to recover quickly and painlessly, and hopefully transparently to the end-user.
That’s my entry on this fine T-SQL Tuesday. Thanks for stopping by Pearl Knows!
Oh, now on to tweeting my entry on Twitter with #TSQL2sDay. Don't forget to follow me on Twitter @PearlKnows!