Server 2008 Enterprise x64 SP2
SQL Server 2005 SP3 Enterprise x64
Windows 2008 and SQL Clustered
At the end of last summer we completed a DR exercise.
The exercise involved physically segregating a portion of our offsite network
and restoring copies of the infrastructure (i.e., Active Directory), installing copies of Windows Server 2003 and 2008 clusters and SQL Server 2005, and performing some testing.
After testing, while breaking down this DR environment, a task was performed out of step.
The network line connecting our live environment and the DR environment was re-established
before the DR Active Directory domain controllers were destroyed. Because the DR domain is essentially a copy of our live domain, the DR AD controller replicated with our live environment.
Trouble ensued. Accounts and objects were overwritten.
Most of the damage was easily reversed in the case of user accounts, but the Windows 2008 clustered SQL Servers were a different story.
Because the DR servers were copies of the real ones, the physical computer objects, the cluster object, and the SQL Server virtual name objects were all overwritten with the DR copies.
The first evidence of a problem was that, when remoting into the physical node, a message appeared stating that the node was no longer trusted.
Secondly, upon failing the SQL instance over to the other (non-affected) node, the SQL virtual name would not come online, rendering the instance useless. Thirdly, upon failure or restart of the cluster service, it failed to come online as well, rendering all instances useless, even those not in the DR test.
The problem seemed to be that the AD computer objects (SID?) no longer matched what the machines and clusters were expecting, so they had no access to them.
The Windows 2003 Clusters did not face this problem as they do not seem to use these objects in the same way as Server 2008 clustering.
After some time troubleshooting with Microsoft, the only proposed solution was to restore Active Directory.
Our team did not feel comfortable with this option. We ended up building a new cluster and clearing out the AD objects, after which I reinstalled the 4 affected SQL instances.
Has anyone faced a similar issue before? Is there an easier way to fix this?
The AD objects seem like an Achilles heel of Windows 2008 clustering. Windows 2003 did not have this problem.
Thanks for any comments.