Always On Failover Issue

  • Do you know if the heartbeat is using the same network as other data?

    I once had an issue where the database backup was using all of the bandwidth and stopping the heartbeat from working long enough for a failover to occur. Or, if you are running in a virtual environment, another VM guest could have used the bandwidth.

    As for whether it's something to worry about: ideally, I think it's worth trying to find out what caused it, if possible, since it could repeat or get worse. I assume the failover disconnected all of the client applications?

  • I do believe the heartbeat is using the same network as everything else.

    I use a monitoring tool as well and at the same time this happened to the cluster, the monitoring tool picked up this -

    Cannot connect to SQL Server instance '######' : A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.) : The semaphore timeout period has expired [121] (requires acknowledgement)

    From what I've read, this error can relate to network adapter issues, so I will ask my network guys to have another look at this. I can't see any error messages that don't point towards the network.

    • This reply was modified 5 years, 5 months ago by garryha.
  • I've seen issues like this related to VSS backups. During these backups the server is frozen for a short time, including the network cards. In one AG environment the primary went into a pending state for a couple of seconds but afterwards came back online without a failover. On other days a failover occurred.
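
One way to test the shared-network and backup theories raised above is to log basic TCP reachability of the instance and then compare any outage windows against the backup schedule. The following is a minimal Python sketch under stated assumptions: the function names (`probe_tcp`, `monitor`, `find_gaps`) are illustrative and not part of any tool mentioned in this thread, and the host name and port in the usage note are placeholders. It only checks whether a TCP connection can be opened; it does not measure the cluster heartbeat itself.

```python
import socket
import time
from typing import List, Optional, Tuple

Sample = Tuple[float, Optional[float]]  # (timestamp, connect latency or None)


def probe_tcp(host: str, port: int, timeout: float = 2.0) -> Optional[float]:
    """Return the TCP connect time in seconds, or None if the attempt failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return None


def monitor(host: str, port: int, count: int, interval: float = 1.0) -> List[Sample]:
    """Probe the endpoint `count` times, `interval` seconds apart."""
    samples: List[Sample] = []
    for _ in range(count):
        samples.append((time.time(), probe_tcp(host, port)))
        time.sleep(interval)
    return samples


def find_gaps(samples: List[Sample], min_failures: int = 3) -> List[Tuple[float, float]]:
    """Group runs of at least `min_failures` consecutive failed probes
    into (first_failure_ts, last_failure_ts) windows."""
    gaps: List[Tuple[float, float]] = []
    run: List[float] = []
    for ts, latency in samples:
        if latency is None:
            run.append(ts)
            continue
        if len(run) >= min_failures:
            gaps.append((run[0], run[-1]))
        run = []
    if len(run) >= min_failures:
        gaps.append((run[0], run[-1]))
    return gaps
```

As a usage example, `find_gaps(monitor("sqlnode1.example.local", 1433, count=3600))` would probe a hypothetical node once a second for about an hour and return any windows where three or more probes in a row failed. If those windows line up with backup or VSS snapshot times, that would support the explanations suggested in the replies; if not, the network adapter remains the more likely suspect.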
