Failover

Steve Jones - SSC Editor

SSC Guru

Points: 734434
August 5, 2016 at 12:05 am

#315953

Comments posted to this topic are about the item Failover

David.Poole SSC Guru Points: 75896 · Answer 1

I remember our production team arguing about a split-brain situation in a mirrored server: the support engineer kept insisting it was impossible, and the team said, "It's right here in front of us!"

I worry more about data centre failovers. Database cluster/mirror failover is an out-of-the-box configuration activity. Data centre failover is subject to the skills and resources that your organisation puts behind the work to build it. Bi-directional failover of data centres is complicated. It's one thing to fail over to a data centre that might be slightly behind in syncing up transactions, but it's another thing entirely to reconcile, merge and fail back.

Gary Varga SSC Guru Points: 82166 · Answer 2

It seems obvious to me that most would want a configurable solution with defaults provided out of the box. Of course, we would hope for better defaults than the ones we get for new databases.

Gaz

-- Stop your grinnin' and drop your linen...they're everywhere!!!

thewoodymax SSC Enthusiast Points: 154 · Answer 3

Steve,

Tuning a database environment to make it efficient and reliable is a balancing act and one of the great arts of a good DBA. Your commentary on "Failover" made a very good point that I believe every DBA should consider. At the end of your commentary, you suggested that a system that is constantly failing, and so forcing failovers, is probably worse than one that does not fail over in an emergency. While a system that is constantly failing over is problematic because it causes delays for users, I believe a delay in work should have far less of an impact than a system that shuts down completely because it did not fail over when needed. My question is: how does one ensure that the system will fail over when there is an emergency, and still ensure that a failover does not occur when there is a system "hiccup"?

Steve Jones - SSC Editor SSC Guru Points: 734434 · Answer 4

Jeff Torres (8/7/2016)
My question is how does one ensure that the system will failover when there is an emergency and still ensure that the system failover does not occur when there is a system "hiccup"?

No idea. There should be ways to delay, or perhaps limit, the failovers in some way. If you're having some network issues, it might be better to have users experience some delays on the primary than to fail back and forth to a secondary.

It depends on your setup and the situation, but I've seen places where a cluster keeps failing back and forth because of some network or other non-client issue. This results in client downtime and, at times, someone just removing a node or two from the cluster so clients can work.
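
As a rough illustration of the "delay or limit" idea, here is a minimal Python sketch of a gate that sits between a health check and the decision to fail over. It is not how SQL Server clustering or any particular product works. The class name `FailoverGate`, the `failure_threshold` and `cooldown_seconds` parameters, and their default values are all hypothetical. The gate only asks for a failover after several consecutive failed checks (so a single hiccup is ignored), and it refuses to fail over again until a cooldown has passed (so the system cannot bounce back and forth).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailoverGate:
    """Hypothetical gate that decides when failed health checks justify a failover."""

    failure_threshold: int = 3        # consecutive failed checks required (assumed value)
    cooldown_seconds: float = 300.0   # minimum gap between failovers (assumed value)
    _consecutive_failures: int = 0
    _last_failover_at: Optional[float] = None

    def record_check(self, healthy: bool, now: float) -> bool:
        """Record one health check result; return True if a failover should start now."""
        if healthy:
            # Any successful check resets the streak, so isolated hiccups never add up.
            self._consecutive_failures = 0
            return False

        self._consecutive_failures += 1
        if self._consecutive_failures < self.failure_threshold:
            # Not enough evidence yet: treat it as a hiccup and stay on the primary.
            return False

        in_cooldown = (
            self._last_failover_at is not None
            and now - self._last_failover_at < self.cooldown_seconds
        )
        if in_cooldown:
            # A failover happened recently; do not flap to the other node again yet.
            return False

        self._last_failover_at = now
        self._consecutive_failures = 0
        return True


# Usage: one hiccup followed by a sustained outage, with checks every 10 seconds.
gate = FailoverGate(failure_threshold=3, cooldown_seconds=300.0)
results = [False, True, False, False, False]
decisions = [gate.record_check(ok, now=i * 10.0) for i, ok in enumerate(results)]
print(decisions)  # [False, False, False, False, True]
```

The trade-off is the one discussed above: a higher threshold or a longer cooldown means fewer needless failovers, but a slower reaction to a real emergency. The right values depend on the setup and on how much delay the clients can tolerate.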