Cluster Failover
Posted Tuesday, October 19, 2010 12:12 PM
Hi

Server: Windows Server 2008

DB Server: SQL Server 2008 (SP1)

First of all, I went through all the logs and could not find the reason the failover was initiated. Shouldn't something be logged explaining why the failover happened? Secondly, after the failover the service would not come online because a duplicate IP address was detected. Later, when we manually brought the service online from cluster management, it came online successfully. I don't understand how the duplicate IP address conflict would get resolved just by starting the service manually.

Lastly, we see a few errors related to the physical disk resource between failover retries. Could these be correlated with the failover error? Please help me troubleshoot these errors; I am not very experienced with clustering. Thanks in advance for your help. :)


Here is the series of events that occurred.

1.) Event ID: 1135

Cluster node 'XYZ' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

2.) Event ID: 1049

Cluster IP address resource 'SQL IP Address 1 (XYZ)' cannot be brought online because a duplicate IP address '10.9.8.113' was detected on the network. Please ensure all IP addresses are unique.

3.) Event ID: 1069

Cluster resource 'SQL IP Address 1 (XYZ)' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.

4.) Event ID: 1049

Cluster IP address resource 'Cluster IP Address' cannot be brought online because a duplicate IP address '10.9.8.112' was detected on the network. Please ensure all IP addresses are unique.

5.) Event ID: 1069

Cluster resource 'Cluster IP Address' in clustered service or application 'Cluster Group' failed.

6.) Event ID: 1066

Cluster disk resource 'Cluster Disk 25' indicates corruption for volume '\\?\Volume{88552e6f-aea2-11df-9790-0026b92fffa7}'. Chkdsk is being run to repair problems. The disk will be unavailable until Chkdsk completes. Chkdsk
output will be logged to file 'C:\Windows\Cluster\Reports\ChkDsk_ResCluster Disk 25_Disk16Part1.log'. Chkdsk may also write information to the Application Event Log.

7.) Event ID: 1066

Cluster disk resource 'Cluster Disk 26' indicates corruption for volume '\\?\Volume{88552e05-aea2-11df-9790-0026b92fffa7}'. Chkdsk is being run to repair problems. The disk will be unavailable until Chkdsk completes. Chkdsk
output will be logged to file 'C:\Windows\Cluster\Reports\ChkDsk_ResCluster Disk 26_Disk4Part1.log'. Chkdsk may also write information to the Application Event Log.

8.) Event ID: 1049

(Same message as point 2)

9.) Event ID: 1069

(Same message as point 3)

10.) Event ID: 1049

(Same message as point 4)

11.) Event ID: 1069

(Same message as point 5)

12.) Event ID: 1205

The Cluster service failed to bring clustered service or application 'Cluster Group' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.

13.) Event ID: 1069

Cluster resource 'Cluster Disk 17' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.

14.) Event ID: 1049

(Same message as point 2)

15.) Event ID: 1069

Cluster resource 'SQL IP Address 1 (XYZ)' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.

16.) Event ID: 1205

The Cluster service failed to bring clustered service or application 'SQL Server (MSSQLSERVER)' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.




Thanks

Mushtaq
Post #1007216
Posted Wednesday, October 20, 2010 2:10 PM


mushtaq777 (10/19/2010)
Cluster node 'XYZ' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

Have you checked all the network hardware to make sure there are no issues there? The last time I had a failover on one of my clusters, someone had made a VLAN change to the port on the switch servicing the active node!


mushtaq777 (10/19/2010)
2.) Event ID: 1049

Cluster IP address resource 'SQL IP Address 1 (XYZ)' cannot be brought online because a duplicate IP address '10.9.8.113' was detected on the network. Please ensure all IP addresses are unique.

This means what it says. Whoever assigns IP addresses in your enterprise should confirm the IP for your clustered instance. Take the virtual IP offline and then try to ping it to see whether it still replies!
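
In case it helps to script that ping check, here is a minimal Python sketch. Nothing in it comes from the original posts except the address 10.9.8.113; the helper names `ping_command` and `address_still_answers` are made up for illustration. It counts a reply as genuine only when the output contains a TTL field, because on Windows the exit code alone can be 0 even for a "Destination host unreachable" response. Run it only after the cluster IP resource has been taken offline, otherwise the cluster itself will answer.

```python
import platform
import subprocess


def ping_command(ip, count=2):
    """Build a ping command for the local OS (-n on Windows, -c elsewhere)."""
    flag = "-n" if platform.system().lower() == "windows" else "-c"
    return ["ping", flag, str(count), ip]


def address_still_answers(ip, run=subprocess.run):
    """Return True if some host replies to ping on `ip`.

    `run` is injectable (an assumption of this sketch) so the decision
    logic can be exercised without touching the network.
    """
    result = run(ping_command(ip), capture_output=True, text=True)
    output = (result.stdout or "").lower()
    # A real echo reply carries a TTL value on Windows, Linux and macOS.
    return result.returncode == 0 and "ttl=" in output


# Example (with the SQL virtual IP resource offline):
#   if address_still_answers("10.9.8.113"):
#       print("Another device is still holding 10.9.8.113")
```

A host that blocks ICMP will not reply even though it holds the address, so looking at the ARP cache (`arp -a`) right after the ping is a useful second check.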


mushtaq777 (10/19/2010)
3.) Event ID: 1069

Cluster resource 'SQL IP Address 1 (XYZ)' in clustered service or application 'SQL Server (MSSQLSERVER)' failed.

4.) Event ID: 1049

Cluster IP address resource 'Cluster IP Address' cannot be brought online because a duplicate IP address '10.9.8.112' was detected on the network. Please ensure all IP addresses are unique.

5.) Event ID: 1069

Cluster resource 'Cluster IP Address' in clustered service or application 'Cluster Group' failed.

These are all related to the virtual IP issue.
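
To see at a glance which addresses are in conflict, something like the following Python sketch could be run over the exported event text. It is only an illustration: the function name `duplicate_ips` and the regular expression are my own assumptions, written against the wording of the 1049 messages quoted above, and the sample list simply repeats those messages.

```python
import re

# Pattern written against the Event ID 1049 wording quoted in this thread.
DUPLICATE_IP = re.compile(
    r"Cluster IP address resource '(?P<resource>[^']+)' cannot be brought "
    r"online because a duplicate IP address '(?P<ip>[\d.]+)' was detected"
)


def duplicate_ips(messages):
    """Map each conflicting IP address to the cluster resources reporting it."""
    found = {}
    for text in messages:
        match = DUPLICATE_IP.search(text)
        if match:
            found.setdefault(match.group("ip"), set()).add(match.group("resource"))
    return found


messages = [
    "Cluster IP address resource 'SQL IP Address 1 (XYZ)' cannot be brought "
    "online because a duplicate IP address '10.9.8.113' was detected on the "
    "network. Please ensure all IP addresses are unique.",
    "Cluster IP address resource 'Cluster IP Address' cannot be brought "
    "online because a duplicate IP address '10.9.8.112' was detected on the "
    "network. Please ensure all IP addresses are unique.",
    "Cluster resource 'Cluster IP Address' in clustered service or "
    "application 'Cluster Group' failed.",
]

for ip, resources in sorted(duplicate_ips(messages).items()):
    print(ip, sorted(resources))
```

Both the SQL virtual IP and the cluster's own IP show up, which points to something on the network holding (or still caching) both addresses rather than to a problem with one resource.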


mushtaq777 (10/19/2010)
6.) Event ID: 1066

Cluster disk resource 'Cluster Disk 25' indicates corruption for volume '\\?\Volume{88552e6f-aea2-11df-9790-0026b92fffa7}'. Chkdsk is being run to repair problems. The disk will be unavailable until Chkdsk completes. Chkdsk
output will be logged to file 'C:\Windows\Cluster\Reports\ChkDsk_ResCluster Disk 25_Disk16Part1.log'. Chkdsk may also write information to the Application Event Log.

7.) Event ID : 1066

Cluster disk resource 'Cluster Disk 26' indicates corruption for volume '\\?\Volume{88552e05-aea2-11df-9790-0026b92fffa7}'. Chkdsk is being run to repair problems. The disk will be unavailable until Chkdsk completes. Chkdsk
output will be logged to file 'C:\Windows\Cluster\Reports\ChkDsk_ResCluster Disk 26_Disk4Part1.log'. Chkdsk may also write information to the Application Event Log.

Have you tried running CHKDSK on these disks as the message suggests?


-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs"
Post #1008010
Posted Friday, September 27, 2013 7:58 AM


Did you resolve the issues? We have the same errors.

Cluster node 'ABC' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster. Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges
Post #1499390
Posted Monday, April 7, 2014 7:03 AM
Were you able to resolve this issue? We are also seeing these errors.
Post #1559045
Posted Monday, April 7, 2014 7:08 AM


SQL Server 2012 AlwaysOn on Windows Server 2008 R2 has so many issues with clustering. We rebuilt our servers with Windows Server 2012 Datacenter and SQL Server 2012 AlwaysOn. We also set our data, log, and temp drives to independent disk mode in VMware.
Post #1559051