Cluster configuration corrupted

  • We have SQL Server 2008 SP1 running a single node cluster SQL 2008 R2 SP2. We've made multiple attempts to install Cumulative Update 3. When I rebooted this morning, I noticed that the cluster disks initially showed Failed in Cluster Manager. I started the SQL upgrade anyway, thinking that if the SQL service was off, it wouldn't matter. Minutes later I was able to start the disks manually, but the update still failed.

    After uninstalling the SQL CU and rebooting, ALL of the cluster resources showed offline. I was able to manually bring the Network Name & Disks online, but left SQL off. The SQL upgrade failed again, with Details.txt containing the following:

    2013-04-30 07:16:02 Slp: Taking cluster group 'SQL Server (MSSQLSERVER)' offline

    2013-04-30 07:16:04 Slp: Configuration action failed for feature SQL_Engine_Core_Inst during timing ShutdownNonInstance and scenario ShutdownNonInstance.

    2013-04-30 07:16:04 Slp: The cluster group 'SQL Server (MSSQLSERVER)' could not be taken offline. Error: The group or resource is not in the correct state to perform the requested operation. (Exception from HRESULT: 0x8007139F)

    2013-04-30 07:16:04 Slp: The configuration failure category of current exception is ConfigurationFailure

    2013-04-30 07:16:04 Slp: Configuration action failed for feature SQL_Engine_Core_Inst during timing ShutdownNonInstance and scenario ShutdownNonInstance.

    2013-04-30 07:16:04 Slp: Microsoft.SqlServer.Configuration.Cluster.ClusterException: The cluster group 'SQL Server (MSSQLSERVER)' could not be taken offline. Error: The group or resource is not in the correct state to perform the requested operation. (Exception from HRESULT: 0x8007139F) ---> System.Runtime.InteropServices.COMException (0x8007139F): The group or resource is not in the correct state to perform the requested operation. (Exception from HRESULT: 0x8007139F)
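
    For reference, the HRESULT in these log lines can be unpacked into its parts. This is only a minimal sketch, assuming the standard HRESULT bit layout (failure bit, 11-bit facility, 16-bit code); the helper names `decode_hresult` and `find_hresults` are illustrative and not part of any SQL Server or cluster tooling.

```python
import re

# Matches the "HRESULT: 0x8007139F" form that appears in Details.txt.
HRESULT_PATTERN = re.compile(r"HRESULT: (0x[0-9A-Fa-f]{8})")

def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into failure flag, facility and error code."""
    return {
        "failure": bool((hresult >> 31) & 1),   # top bit set means failure
        "facility": (hresult >> 16) & 0x7FF,    # 7 is FACILITY_WIN32
        "code": hresult & 0xFFFF,               # underlying Win32 error code
    }

def find_hresults(log_text: str) -> list:
    """Return the distinct HRESULT values found in a block of log text."""
    return sorted({int(m, 16) for m in HRESULT_PATTERN.findall(log_text)})

line = ("The cluster group 'SQL Server (MSSQLSERVER)' could not be taken "
        "offline. (Exception from HRESULT: 0x8007139F)")
for value in find_hresults(line):
    print(hex(value), decode_hresult(value))
```

    Facility 7 with code 5023 corresponds to the Win32 error ERROR_INVALID_STATE, which matches the "not in the correct state" text in the log.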

    After uninstalling & rebooting again, I thought I would attempt the upgrade with the SQL Service running, but it would fail to start via Cluster Manager, saying "The module cannot be found." I showed this to our SAN guy, and he started SQL via the Services applet. The next CU install failed again.

    Now the SQL Server item in Cluster Manager is inoperative. It will not start SQL or load the properties tab (Error loading properties for this item). The second tab reads "The specified node does not support a resource of this type. This may be due to version inconsistencies or the absence of the resource dll on this node."

    SQL failed to start from SQL Configuration Manager but did start (as did SQL Agent) from the Windows Services window. Connections can be made through the clustered Network Name & IP, and users are back in the apps.

    Naturally we'd like to fix this without breaking it further. Since the clustered resources were built by the SQL Installation Wizard, we're reluctant to delete & rebuild them. The SAN guy recommends a Repair from SQL Setup, but this looks like a cluster issue to me. We could also run the Cluster

    What's the best approach at this point with the least risk of breaking it further?

  • One question springs to mind

    If you had a resource in your cluster group marked as failed, why on earth would you think the update would still be successful?

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • I was hoping to update the binaries on the C: drive, then deal with the SAN drives issue afterward.
