Regarding Cluster Failover


SQLisAwE5OmE
Hi All,

I was troubleshooting a connectivity issue on a clustered server.

SQL Server 2005 on Windows Server 2003.

I found that an instance was down and manually brought it back online. However, I am wondering why it didn't fail over to another node instead of simply failing.

Does failover only happen when the node is down?

Can someone advise me how to check why it didn't fail over?

Thanks,
SueTons.

Regards,
SQLisAwe5oMe.
calvo
Have you checked the Windows logs? I'm sure something would be in there if a failover was attempted but couldn't succeed for some reason.

SpringTownDBA
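Check the cluster log. On Windows Server 2003 the default location is %SystemRoot%\Cluster\cluster.log (for example C:\WINDOWS\Cluster\cluster.log).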
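
A cluster log can be long, so it may help to pull out only the problem lines first. The following is a minimal sketch, assuming the log marks severity with whitespace-separated tokens such as ERR and WARN; the function name `find_problem_lines` is hypothetical and not part of any cluster tooling.

```python
def find_problem_lines(lines, levels=("ERR", "WARN")):
    """Return (line_number, text) pairs for lines that carry a severity token.

    Assumption: each log line contains its severity as a separate
    whitespace-delimited token (for example ERR or WARN).
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        # Match whole tokens only, so a word like "ERROR_CODE" is not a hit.
        if any(token in levels for token in line.split()):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it on a real file, open the log with something like `open(path, errors="replace")` and pass the file object as `lines`; the timestamps around the hits show what happened just before the instance went down.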
hs24
Within the configuration, are both servers possible owners of the resources within the SQL resource group? Also, are there any errors in the Windows event logs indicating an issue when trying to fail over to the secondary node, and was the Cluster Service running on the passive node?

The failover mechanism uses a "LooksAlive" check and an "IsAlive" check. The LooksAlive check takes place every 5 seconds on the host node within the failover cluster, whereas a more in-depth check, the IsAlive check, takes place every 60 seconds using SELECT @@SERVERNAME.
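
To get a feel for how often each check runs, here is a small timeline sketch. It only models the two intervals quoted above (5 and 60 seconds); the function names are hypothetical, and the real resource DLL does not necessarily schedule its checks on exact multiples like this.

```python
LOOKS_ALIVE_INTERVAL = 5   # seconds, the default quoted in the post above
IS_ALIVE_INTERVAL = 60     # seconds, the default quoted in the post above

def checks_due(second):
    """Return the names of the health checks that fire at a given second."""
    due = []
    if second % LOOKS_ALIVE_INTERVAL == 0:
        due.append("LooksAlive")
    if second % IS_ALIVE_INTERVAL == 0:
        due.append("IsAlive")
    return due

def count_checks(window_seconds):
    """Count how many times each check fires in seconds 1..window_seconds."""
    counts = {"LooksAlive": 0, "IsAlive": 0}
    for second in range(1, window_seconds + 1):
        for name in checks_due(second):
            counts[name] += 1
    return counts
```

In this model a five-minute window contains 60 LooksAlive checks but only 5 IsAlive checks, which is why a hung instance can take up to about a minute to be detected by the deeper check.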
Perry Whittle
SQLCrazyCertified (11/20/2012)
Hi All,

I was troubleshooting a connectivity issue on a clustered server.

SQL Server 2005 on Windows Server 2003.

I found that an instance was down and manually brought it back online. However, I am wondering why it didn't fail over to another node instead of simply failing.

Does failover only happen when the node is down?

Can someone advise me how to check why it didn't fail over?

Thanks,
SueTons.

The default is generally to try to restart locally; if this is unsuccessful, the cluster tries 3 times on the partner. If that is also unsuccessful, the group goes offline.
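
That default sequence can be sketched as a small decision function. This is only a model of the behaviour described above, not the cluster service's actual logic; `start_ok` is a hypothetical callback standing in for "did the resource come online on this node on this attempt".

```python
def resolve_failure(start_ok, partner_attempts=3):
    """Model the default policy: one local restart, then N tries on the partner.

    start_ok(node, attempt) -> bool is a hypothetical callback; node is
    "local" or "partner" and attempt counts from 1.
    """
    if start_ok("local", 1):
        return "online on local node"
    for attempt in range(1, partner_attempts + 1):
        if start_ok("partner", attempt):
            return "online on partner node"
    # Every attempt failed, so the group is left offline.
    return "group offline"
```

In this model, a group found offline with no node down means every restart attempt failed on both nodes, which matches the situation in the original question.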

Check the event logs, but also, with the group offline, move it to the partner node if it's not already there.
Now try to bring the resources online one at a time, in this order:


  • Network IP
  • Network name
  • Disk resources
  • SQL Server service
  • SQL Server Agent service



Look out for any resources that fail; the most common problem is the disk resources not failing over properly from one node to another.
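
The one-at-a-time procedure above can be sketched as a loop that stops at the first resource that refuses to start. The resource names come from the list above; `bring_online` is a hypothetical callback (in practice you would do this step by hand in Cluster Administrator), so treat this as an outline of the method rather than a working tool.

```python
ONLINE_ORDER = [
    "Network IP",
    "Network name",
    "Disk resources",
    "SQL Server service",
    "SQL Server Agent service",
]

def bring_online_in_order(bring_online, order=ONLINE_ORDER):
    """Start resources one by one and stop at the first failure.

    bring_online(name) -> bool is a hypothetical callback.
    Returns (started, first_failed); first_failed is None if all started.
    """
    started = []
    for name in order:
        if not bring_online(name):
            return started, name
        started.append(name)
    return started, None
```

Stopping at the first failure is the point of the exercise: the resource that fails in isolation is the one to investigate, rather than whatever depends on it.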

By the way, in future it would help if you posted in the correct forum.

SQLisAwE5OmE
In Event Viewer, I see this:
"A fatal error occurred while reading the input stream from the network. The session will be terminated."

I also see another error every 4 minutes or so, each time with a different SPID:
"The client was unable to reuse the session with SPID 243, which had been reset for connection pooling. This error may have been caused by an earlier operation failing. Check the error logs for failed operations immediately before this error message."

Another error:
"Error -2147023545 - Configuration information could not be read from the domain controller, either because the machine is unavailable, or access has been denied."

A couple of warnings like this:
"The configuration information of the performance library "C:\WINNT\system32\sqlctr90.dll" for the "MSSQL$InstanceName" service does not match the trusted performance library information stored in the registry. The functions in this library will not be treated as trusted."

These are some of the repetitive errors/warnings that I see, and I cannot make much sense of them.

Anyway, the instance was manually brought online and is working properly; however, I am trying to find the root cause.

If you can make more sense of the errors above, let me know.

Thanks,
SueTons.

Regards,
SQLisAwe5oMe.
SQL!$@w$0ME
"A fatal error occurred while reading the input stream from the network. The session will be terminated" - This is a network adapter error, most probably a driver issue.
To determine the current status of TCP Chimney Offload, open a command prompt with administrative credentials and run the following command:

netsh int tcp show global
dan-572483
Are there any messages suggesting a problem with the "heartbeat", or network connectivity issues on the passive node in general? The heartbeat is a dedicated network connection between the two nodes that they use to monitor each other. If the passive node was having network issues when the active node crashed, that would explain what occurred.