Minimising failover latency in a highly transacted environment

Question

JaybeeSQL

September 17, 2015 at 4:45 am

#307806

Hi all,
Background: my current client wants to move to clustering. They are on Standard Edition, so they are likely limited to 2-node clusters. We have approx. 150k executions/day (9-5), mostly read-write. The proposed topology is N+1, in our case 3 clusters of 2 nodes each: a load balancer splits traffic evenly across 2 clusters, and if both nodes of a cluster fail, the load balancer sends requests to the 3rd cluster. All clusters are on the same subnet (I hope to configure a fourth cluster offsite, log-shipped, for DR).
I want to help them implement these clusters and keep the momentary downtime they would experience when one node of a cluster fails, and the cluster fails over, to the minimum possible with the above topology. I know that with Windows 2K8/2012 you can configure the delay between heartbeats and the number of heartbeats attempted before the cluster deems the node to have failed and fails over to the other node.
I would appreciate advice on reducing the delay and the number of attempts to a minimum.
Cheers,
JB
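
To make the trade-off concrete, here is a rough sketch of the arithmetic behind the two settings described above. On a same-subnet Windows failover cluster these correspond to the cluster properties SameSubnetDelay (milliseconds between heartbeats) and SameSubnetThreshold (missed heartbeats tolerated). The candidate values, including the "assumed defaults" of 1000 ms and 5, are illustrative assumptions only and should be checked against the actual cluster; the function names are hypothetical. The figure it produces is only the detection window, and any resource restart and database recovery after failover comes on top of it.

```python
# Sketch only: estimates how quickly a silent node would be declared down.
# The delay/threshold values below are illustrative assumptions, not recommendations.

def detection_window_ms(delay_ms: int, threshold: int) -> int:
    """Approximate time (ms) of missed heartbeats before a node is deemed failed."""
    return delay_ms * threshold


def heartbeats_per_minute(delay_ms: int) -> float:
    """Heartbeats sent per minute at a given delay: a rough proxy for overhead."""
    return 60_000 / delay_ms


# Hypothetical candidate settings as (delay in ms, missed-heartbeat threshold).
candidates = {
    "assumed defaults": (1000, 5),
    "more aggressive": (500, 4),
    "most aggressive": (250, 3),
}

for name, (delay_ms, threshold) in candidates.items():
    window = detection_window_ms(delay_ms, threshold)
    rate = heartbeats_per_minute(delay_ms)
    print(f"{name}: node declared down after ~{window} ms, {rate:.0f} heartbeats/min")
```

A shorter window means a faster failover, but also a greater chance that a brief network blip is treated as a node failure, so any values chosen this way would need testing under real load.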

Robert vd Berg · Answer 1

Hello JB,

This is an interesting question! I can't give you an answer on this one. Personally, I'd be more concerned about preventing failure than improving the failover time, and I would be careful not to increase false positives (by taking corrective action sooner) or add extra overhead (by checking more often).

What exactly is the reason you are trying to tweak this? Did you experience failovers that took too long? And doesn't the load balancer offer some sort of protection against an instance being unavailable?

Robert van den Berg

Freelance DBA
Author of: