Regarding Cluster Failover
Posted Tuesday, November 20, 2012 8:44 AM
Old Hand

Hi All,

I was troubleshooting a connectivity issue for a cluster server.

SQL 2k5 on Windows 2k3.

I found that an instance was down and manually brought it back online. However, I am wondering why it didn't fail over to another node rather than staying failed.

Does failover only happen when the node is down?

Can someone advise me how to check why it didn't fail over?

Thanks,
SueTons.


Regards,
SQLisAwe5oMe.
Post #1386968
Posted Tuesday, November 20, 2012 9:50 AM
Ten Centuries

Have you checked the Windows event logs? I'm sure something would be in there if a failover was attempted but couldn't succeed for some reason.

______________________________________________________________________________________________
Forum posting etiquette. Get your answers faster.
Post #1387015
Posted Tuesday, November 20, 2012 9:56 AM
Old Hand

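Check the cluster log. On Windows Server 2003 it is by default at %SystemRoot%\Cluster\cluster.log (for example, C:\WINDOWS\Cluster\cluster.log) rather than under system32.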
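A cluster log can be long, so it may help to filter it down to the problem entries first. The following is a minimal sketch, not a definitive tool. It assumes each log line carries its severity as a standalone token such as ERR or WARN; the function names are hypothetical, and you pass in whatever path your cluster log actually lives at.

```python
import re
from typing import Iterable, List

# Assumption: severity shows up as a standalone ERR or WARN token on the line.
SEVERITY = re.compile(r"\b(ERR|WARN)\b")


def problem_lines(lines: Iterable[str]) -> List[str]:
    """Return only the lines that carry an ERR or WARN severity token."""
    return [line.rstrip("\n") for line in lines if SEVERITY.search(line)]


def scan_cluster_log(path: str) -> List[str]:
    """Read a log file and return its ERR/WARN lines in original order."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, "r", errors="replace") as handle:
        return problem_lines(handle)
```

Calling scan_cluster_log with the path to a copy of the log returns the matching lines in order, which makes it easier to line them up against the time the instance went down.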



Post #1387019
Posted Tuesday, November 20, 2012 9:58 AM
Right there with Babe

Within the configuration, are both servers possible owners of the resources in the SQL resource group? Also, are there any errors in the Windows event logs indicating a problem when trying to fail over to the secondary node, and was the Cluster service running on the passive node?

The failover mechanism uses a "LooksAlive" check and an "IsAlive" check. The LooksAlive check takes place every 5 seconds on the host node within the failover cluster, whereas the more in-depth IsAlive check takes place every 60 seconds using SELECT @@SERVERNAME.
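To make the two intervals concrete, here is a small sketch that models which check would fire at a given elapsed second. It is a simplification for illustration only: it assumes both checks run on fixed schedules starting from second zero, and the names used are hypothetical rather than anything exposed by the cluster service.

```python
LOOKS_ALIVE_INTERVAL = 5   # seconds between lightweight checks
IS_ALIVE_INTERVAL = 60     # seconds between in-depth checks


def checks_due(second: int) -> list:
    """Return the names of the checks that fire at this elapsed second."""
    due = []
    if second % LOOKS_ALIVE_INTERVAL == 0:
        due.append("LooksAlive")
    if second % IS_ALIVE_INTERVAL == 0:
        due.append("IsAlive")
    return due


def count_checks(window_seconds: int) -> dict:
    """Count how often each check fires in seconds 1..window_seconds."""
    counts = {"LooksAlive": 0, "IsAlive": 0}
    for second in range(1, window_seconds + 1):
        for name in checks_due(second):
            counts[name] += 1
    return counts
```

Under this model the lightweight check runs twelve times for every in-depth check, so a problem that only the in-depth query can detect may go unnoticed for up to IS_ALIVE_INTERVAL seconds.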

Post #1387020
Posted Tuesday, November 20, 2012 11:13 AM


SSCertifiable

The default is generally to try to restart the resource locally first. If that is unsuccessful, it tries up to 3 times on the partner node. If that is also unsuccessful, the group goes offline.
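The sequence described above can be sketched as a small simulation. This is only an illustration of the described default, not the cluster service's actual logic: recover and start_on are hypothetical names, and start_on stands in for whatever decides whether the resource comes online on a given node.

```python
from typing import Callable, List


def recover(start_on: Callable[[str], bool], owner: str, partner: str,
            partner_attempts: int = 3) -> List[str]:
    """Return the recovery actions taken, ending with the final state."""
    # Step 1: one local restart on the current owner.
    actions = [f"restart on {owner}"]
    if start_on(owner):
        return actions + [f"online on {owner}"]
    # Step 2: a limited number of attempts on the partner node.
    for attempt in range(1, partner_attempts + 1):
        actions.append(f"attempt {attempt} on {partner}")
        if start_on(partner):
            return actions + [f"online on {partner}"]
    # Step 3: nothing worked, so the group is left offline.
    return actions + ["group offline"]
```

For example, recover(lambda node: False, "NodeA", "NodeB") ends in "group offline", which matches the case where an instance is simply found down even though both nodes are up.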

Check the event logs, but also, with the group offline, move it to the partner node if it's not already there.
Now try to bring the resources online one at a time, in this order:

  • Network IP
  • Network name
  • Disk resources
  • SQL Server service
  • SQL Server Agent service

Look out for any resources that fail; the most common problem is the disk resources not failing over properly from one node to another.
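The one-at-a-time procedure above can be sketched as a loop that stops at the first resource that fails, so you know exactly where to look. This is a hedged illustration: bring_online is a hypothetical callback standing in for the manual step you would perform in the cluster administration tool, and the resource names simply mirror the list above.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Order taken from the list above; names are labels, not real resource names.
ORDER = ("Network IP", "Network name", "Disk resources",
         "SQL Server service", "SQL Server Agent service")


def bring_online_in_order(
    bring_online: Callable[[str], bool],
    order: Sequence[str] = ORDER,
) -> Tuple[List[str], Optional[str]]:
    """Bring resources online one by one; stop at the first failure.

    Returns the resources that came online and the one that failed
    (or None if every resource came online).
    """
    online: List[str] = []
    for resource in order:
        if not bring_online(resource):
            return online, resource
        online.append(resource)
    return online, None
```

If the callback fails on "Disk resources", the result shows the two network resources online and names the disk as the failure, which is the common case mentioned above.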

Btw in future it would help if you post in the correct forum


-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs"
Post #1387073
Posted Tuesday, November 20, 2012 11:33 AM
Old Hand

From eventvwr, I see this....
"A fatal error occurred while reading the input stream from the network. The session will be terminated."

I also see another error every 4 minutes or so, each time with a different SPID:
"The client was unable to reuse the session with SPID 243, which had been reset for connection pooling. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message."

Another error
"Error -2147023545 - Configuration information could not be read from the domain controller, either because the machine is unavailable, or access has been denied."

A couple of warnings like this:
"The configuration information of the performance library "C:\WINNT\system32\sqlctr90.dll" for the "MSSQL$InstanceName" service does not match the trusted performance library information stored in the registry. The functions in this library will not be treated as trusted."

These are some of the recurring errors and warnings that I see, and I cannot make much sense of them.

Anyway, the instance was manually brought online and is working properly; however, I am still trying to find the root cause.

If you can make more sense of the errors above, let me know.

Thanks,
SueTons.



Regards,
SQLisAwe5oMe.
Post #1387079
Posted Tuesday, November 20, 2012 8:20 PM
SSC Rookie

"A fatal error occurred while reading the input stream from the network. The session will be terminated" - This is a network adapter error, most probably a driver issue.
TCP Chimney Offload is one setting worth checking. To determine its current status, open a command prompt with administrative credentials and run the following command:

netsh int tcp show global
Post #1387213
Posted Monday, November 26, 2012 3:55 PM
Mr or Mrs. 500

Are there any messages suggesting a problem with the "Heartbeat"? Or network connectivity issues on the passive node in general? The heartbeat is a special network connection between the two nodes that they use to monitor each other. If the passive node was having network issues when the active node crashed, that would explain what occurred.
Post #1388882