One Node in Cluster has completely failed and we need to rebuild the cluster

  • One node in our cluster had a hardware failure, is completely down, and needs to be rebuilt. I am asking for advice/suggestions on getting the node and the cluster working again.

    1. The OS is Windows Server 2008 R2 Enterprise

    2. The SQL version is SQL Server 2008

    They are asking if we want to

    a) completely rebuild the server. If we go down this path, what do I need to do to get the SQL cluster working?

    b) restore the server from an earlier backup. Will this restore mess up the SQL cluster settings or functions?

    Please advise. Thank you in advance.

  • Found this article from Microsoft:

    http://msdn.microsoft.com/en-us/library/ms181075(v=sql.100).aspx

    Has anyone had any issues following the steps in there?

  • Just to add a checkpoint:

    Once the cluster node is built and added to the MSCS cluster, please validate the cluster setup.

    After installing SQL Server, test the cluster by performing a failover.
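
    As a rough sketch of those two checks, the snippet below only builds the PowerShell commands one might run on a cluster node; it does not execute them. It assumes the FailoverClusters PowerShell module (shipped with Windows Server 2008 R2) is available. The group name "SQL Server (MSSQLSERVER)" and the node name "NODE2" are hypothetical placeholders, not values from this thread.

```python
def validation_commands(sql_group, target_node):
    """Return PowerShell command strings to validate the cluster and test a failover.

    sql_group and target_node are placeholders supplied by the caller;
    nothing here is executed against a real cluster.
    """
    return [
        "Import-Module FailoverClusters",   # load the failover clustering cmdlets
        "Test-Cluster",                     # run the cluster validation tests
        # move the SQL Server group to the rebuilt node to test failover
        f'Move-ClusterGroup -Name "{sql_group}" -Node "{target_node}"',
    ]


# Hypothetical names, for illustration only.
for command in validation_commands("SQL Server (MSSQLSERVER)", "NODE2"):
    print(command)
```

    Running Move-ClusterGroup a second time with the original node name would fail the group back.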

  • DBA in Unit 7 (1/8/2013)


    One node in our cluster had a hardware failure, is completely down, and needs to be rebuilt. I am asking for advice/suggestions on getting the node and the cluster working again.

    1. The OS is Windows Server 2008 R2 Enterprise

    2. The SQL version is SQL Server 2008

    They are asking if we want to

    a) completely rebuild the server. If we go down this path, what do I need to do to get the SQL cluster working?

    b) restore the server from an earlier backup. Will this restore mess up the SQL cluster settings or functions?

    Please advise. Thank you in advance.

    How recent is the system state backup of the failed node?

    If you don't have a recent backup, I would evict the failed node and perform a clean OS install on the new hardware. Add the node to the cluster and then reinstall any clustered applications.

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Perry Whittle (1/9/2013)


    How recent is the system state backup of the failed node?

    If you don't have a recent backup, I would evict the failed node and perform a clean OS install on the new hardware. Add the node to the cluster and then reinstall any clustered applications.

    +1 from me. It's cleaner in my opinion.

  • Thanks for all the responses above.

    Now that the cluster is back in shape, I would like to report back on what was done.

    1. Yes, the failed node was evicted from the cluster and a clean OS install was done on the new hardware.

    2. The Windows cluster was re-established and handed over to us (our hosting service provider handles the OS and below).

    One thing I would like to add here, though: make sure your surviving node (the one running SQL Server) will NOT fail over to the newly built node until SQL Server is installed on it.

    3. Once I got the Windows cluster, I ran a cluster validation report and it went well.

    4. Then I installed SQL Server 2008 (choosing the "add node" option). I had to run the installation from the command line, though. It went smoothly. 🙂

    5. Installed SP3.

    6. Installed other stuff (Oracle client for the linked server, and some other backup/monitoring utilities).

    Failover and failback work as expected.

    Yes, I think we were lucky; this incident didn't create much drama. Thanks again for all the input above.
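
    For anyone repeating step 4, here is a minimal sketch of how an unattended "add node" command line might be assembled. The parameters are the ones documented for the SQL Server 2008 setup AddNode action; the instance name, service accounts, and passwords shown are hypothetical placeholders, and this is not necessarily the exact command used in this rebuild. The function only builds the argument list and does not run setup.

```python
def build_add_node_command(instance_name, sql_account, sql_password,
                           agent_account, agent_password,
                           setup_path="setup.exe"):
    """Build the argument list for an unattended SQL Server 2008 AddNode run.

    All values are supplied by the caller; nothing is executed here.
    """
    return [
        setup_path,
        "/q",                                  # quiet (unattended) mode
        "/ACTION=AddNode",                     # join an existing failover cluster instance
        f"/INSTANCENAME={instance_name}",
        f"/SQLSVCACCOUNT={sql_account}",       # same service accounts as the other node(s)
        f"/SQLSVCPASSWORD={sql_password}",
        f"/AGTSVCACCOUNT={agent_account}",
        f"/AGTSVCPASSWORD={agent_password}",
    ]


# Hypothetical values, for illustration only.
args = build_add_node_command(
    instance_name="MSSQLSERVER",
    sql_account=r"DOMAIN\sqlsvc",
    sql_password="<password>",
    agent_account=r"DOMAIN\sqlagent",
    agent_password="<password>",
)
print(" ".join(args))
```

    The service accounts should match those already used by the instance on the surviving node.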
