Availability Group in resolving state

  • This morning one of our databases was not accessible. After investigation, I found that the Always On Availability Group was in the "RESOLVING" state. I was able to bring it back online manually via Failover Cluster Manager. I knew there had possibly been network connectivity issues during the night and in the morning that could have resulted in a few minutes of downtime here and there. By checking the log, I also found that it was the very last incident that left the AG in the "RESOLVING" state. Prior to that one, there were a few other instances where the server had brought the AG back online automatically. What puzzles me is why it went into "RESOLVING" when the AG has not been set up for automatic failover. In my case, the Always On health log shows a few sets of states that go from PRIMARY_NORMAL to RESOLVING_NORMAL to PRIMARY_PENDING to PRIMARY_NORMAL. It also shows errors about lease expiration. Does this mean that if the lease expires, the server will initiate a failover attempt regardless of the failover setting?

    The lease expiration was perhaps caused by "sp_server_diagnostics" not returning its result quickly enough. There were a few entries in the SQLDIAG log such as "[hadrag] Availability Group is not healthy with given HealthCheckTimeout and FailureConditionLevel" and "[hadrag] Failure detected, diagnostics heartbeat is lost" and "[hadrag] ODBC Error: [HYT00] [Microsoft][SQL Server Native Client 11.0]Query timeout expired (0)". Are there any other places I should look to find out more about what could have caused these errors? I checked the cluster log, the SQL Server log, and the event log. They pretty much all show the same thing: "...The availability group database [dbname] is changing roles from "PRIMARY" to "RESOLVING" because the mirroring session or availability group failed over due to role synchronization. This is an informational message only. No user action is required."

    Thanks in advance for any help!

  • There is a Microsoft blog post that has helped me before in working through something similar, and it explains some of the things you are asking about.
    It seems worth checking out for your situation:
    Diagnose Unexpected Failover or Availability Group in RESOLVING State

    Sue
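
When reading through the health log, it can help to separate the incidents that recovered on their own from the one that left the AG stuck. The following is only a minimal sketch, assuming the state names have already been extracted from the log into a chronological list (the list format and the `split_episodes` helper are hypothetical, not part of any SQL Server tooling). It groups the states into episodes that start when the replica leaves PRIMARY_NORMAL and end when it returns.

```python
from typing import List, Tuple

# State name taken from the health log entries quoted in the question.
HEALTHY = "PRIMARY_NORMAL"


def split_episodes(states: List[str]) -> List[Tuple[List[str], bool]]:
    """Group a chronological list of replica states into episodes.

    An episode begins when the state leaves PRIMARY_NORMAL and ends when it
    returns. Each result is (states_seen_in_episode, recovered), where
    recovered is False if the log ends before PRIMARY_NORMAL is seen again.
    """
    episodes: List[Tuple[List[str], bool]] = []
    current: List[str] = []
    for state in states:
        if state == HEALTHY:
            if current:
                # The replica came back to a healthy primary role.
                episodes.append((current, True))
                current = []
        else:
            current.append(state)
    if current:
        # The log ended while the replica was still outside PRIMARY_NORMAL.
        episodes.append((current, False))
    return episodes


if __name__ == "__main__":
    # Hypothetical sample: one self-recovered incident, then one stuck in RESOLVING.
    sample = [
        "PRIMARY_NORMAL",
        "RESOLVING_NORMAL",
        "PRIMARY_PENDING",
        "PRIMARY_NORMAL",
        "RESOLVING_NORMAL",
    ]
    for seen, recovered in split_episodes(sample):
        print(" -> ".join(seen), "| recovered:", recovered)
```

The timestamps of any episode that did not recover are a reasonable starting point for lining up the SQLDIAG, cluster log, and event log entries around the same moment.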
