Failover - AG issue

  • Hi All,
    Issue: We have two instances, say A01 and A02.
    The client sometimes has network issues and loses connectivity. When that happens, the availability groups try to fail over to A02, and that fails because they are not supposed to fail over to A02. I need to figure out how to stop the availability groups from failing over to A02 and keep the AGs on A01. The common errors I see in Failover Cluster Manager are problems with the file share witness server or an AG lease timeout. Everything is back to normal after the failover event, but each time we get a few minutes of downtime, which I would like to prevent. Please share your thoughts on what a permanent solution could be.
    Suggestions: a friend advised me to remove the AG or the clustering, since we know the network has issues. Please share your thoughts and help me out.
    Thank you!

  • You could change to asynchronous commit with manual failover and take the witness out of the AG's cluster, or work on improving the network configuration to avoid the automatic failovers.
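
As a rough sketch of the first suggestion, the small Python helper below only builds the T-SQL text for switching replicas to manual failover and then to asynchronous commit; it does not connect to SQL Server. The AG name `AG1` is a hypothetical placeholder, the replica names A01 and A02 are the ones used in this thread, and the generated statements are assumed to be reviewed and then run on the primary replica.

```python
def quote_name(name):
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value):
    """Quote a Unicode string literal, doubling any single quote."""
    return "N'" + value.replace("'", "''") + "'"


def manual_async_statements(ag_name, replicas):
    """Build T-SQL that sets every replica to manual failover, then async commit."""
    ag = quote_name(ag_name)
    statements = []
    # Failover mode goes first: an asynchronous-commit replica cannot be
    # configured for automatic failover.
    for replica in replicas:
        statements.append(
            f"ALTER AVAILABILITY GROUP {ag} MODIFY REPLICA ON "
            f"{quote_literal(replica)} WITH (FAILOVER_MODE = MANUAL);"
        )
    for replica in replicas:
        statements.append(
            f"ALTER AVAILABILITY GROUP {ag} MODIFY REPLICA ON "
            f"{quote_literal(replica)} WITH (AVAILABILITY_MODE = ASYNCHRONOUS_COMMIT);"
        )
    return statements


# "AG1" is a hypothetical AG name; A01 and A02 are the instances from this thread.
for statement in manual_async_statements("AG1", ["A01", "A02"]):
    print(statement)
```

Setting only FAILOVER_MODE = MANUAL is already enough to stop automatic failover. Adding asynchronous commit means a later manual failover is a forced failover with possible data loss, so that part is worth weighing separately.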

  • In reply to Joe Torre - Wednesday, September 26, 2018 3:47 PM:

    Thank you for your suggestion!

  • In reply to sizal0234 - Wednesday, September 26, 2018 11:58 AM:

    Can you provide more detail on the network setup between the cluster nodes? Is this a virtual or a physical system?

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Thanks...it is virtual.

  • In reply to sizal0234 - Wednesday, October 3, 2018 2:45 PM:

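    A lease timeout means the lease between the cluster's resource host and the SQL Server instance on the primary was not renewed in time. It goes hand in hand with the IsAlive health check, which has its own timeout.
    Increase the timeout values for the mirroring endpoint connection (the replica session timeout) and for the resource health check.
    Test the systems to find out what response times you are actually getting over the virtual network.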
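
One simple way to get a feel for those response times is to time repeated TCP connections from one node to the other. The sketch below is only an illustration using the Python standard library: the function names are made up for this example, and the host and port in the commented usage line are assumptions (the partner node from this thread and 5022, the usual default port for the mirroring endpoint).

```python
import socket
import statistics
import time


def tcp_connect_times_ms(host, port, samples=20, timeout=2.0):
    """Time TCP connects in milliseconds; a failed attempt is recorded as None."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            # Covers refused connections, timeouts and name resolution errors.
            results.append(None)
    return results


def summarize(results):
    """Count failures and report the median and worst successful connect time."""
    ok = [r for r in results if r is not None]
    return {
        "attempts": len(results),
        "failures": len(results) - len(ok),
        "median_ms": statistics.median(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
    }


# Example (assumed values): run on A01 against the partner node's endpoint port.
# print(summarize(tcp_connect_times_ms("A02", 5022)))
```

Running it at intervals, including during the periods when the client reports network trouble, shows whether connects merely slow down or fail outright, which helps when deciding how far to raise the timeouts.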

