Availability Group databases not syncing

  • Here is our setup:
    Primary
    Read-Only secondary, synchronous mode
    DR secondary, not read-only, asynchronous mode. 
    The DR server is not part of the auto failover. It is not set as a possible owner of the AG, and it SHOULD not have been set as a possible owner of the cluster.  It also has no vote in the quorum.
    File share witness is the quorum setup.  
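
As a rough illustration of the vote arithmetic behind the setup above, here is a minimal sketch. It assumes the primary and the read-only secondary each hold one vote (the post only states that the DR node has none and that a file share witness is used), and it ignores dynamic quorum adjustments. The member names are hypothetical. It models cluster-level vote counting only and is not a diagnosis of the database-level error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voter:
    name: str
    votes: int  # 0 means the member has no quorum vote


# Hypothetical member names; vote weights are assumptions based on the
# description above (DR node has no vote, file share witness has one).
VOTERS = [
    Voter("primary", 1),
    Voter("readonly_secondary", 1),
    Voter("dr_secondary", 0),
    Voter("file_share_witness", 1),
]


def has_quorum(voters, online):
    """Return True if the online members hold a strict majority of all votes."""
    total = sum(v.votes for v in voters)
    up = sum(v.votes for v in voters if v.name in online)
    return 2 * up > total


# Primary plus witness hold 2 of 3 votes.
print(has_quorum(VOTERS, {"primary", "file_share_witness"}))
# The DR node alone holds 0 of 3 votes.
print(has_quorum(VOTERS, {"dr_secondary"}))
```

Under these assumptions, the DR node on its own can never satisfy the majority check, which is consistent with it not being intended as a failover target.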

    We had an issue on Saturday. The best we can determine so far is that the time sync on the servers caused the failover.  The cluster failed over to the DR node and the AG became inaccessible.  The DR node is on a different subnet. In the event of a DR, we need to re-configure the listener because changing the connection strings in the applications is far too much work for development.

    Anyway, I failed the cluster and AG back to the proper servers. 
    SOME of the databases showed as not synchronizing on the read-only secondary.  After reboots and suspending and resuming data movement, the databases still would not sync.
    Any process that attempted to access a non-syncing database returned the error:
    Unable to access database 'XXX' because it lacks a quorum of nodes for high availability. Try the operation again later.

    Some databases work just fine; others throw this error.

    I set the AG to async mode, and the DBs are all available.

    A few Google-fu searches turned up nothing specific to this error.  I am stumped.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • When you get "Unable to access database 'XXX' because it lacks a quorum of nodes for high availability. Try the operation again later." it means your servers are not communicating properly.
    What was the issue on Saturday? What's your AG's session timeout?
    I don't think a time sync would cause a failover.
    If some of your DBs were showing as not synchronizing, that means there was no connection between the primary and secondary servers.

    Alex S
  • In reply to AlexSQLForums - Monday, December 17, 2018 2:26 PM

    Session timeout is 360.  

    All of what you are saying makes sense, except that this only occurs when the read-only secondary is set to synchronous.  Some of the databases work fine; others are not accessible, with the above error.
    Everything works as intended when we change it to asynchronous mode.
    We are skeptical that a time sync would do that, but it did change the time by a minute and 45 seconds. At this point, it's all we have.  We had another cluster failover last night after a time sync.

    This AG/cluster has not changed in three years, aside from quarterly patching and reboots.

    Michael L John
