"Data Sync of Availability Group is unhealthy" every 15 minutes but no Prod issues?

  • Hey all,
    I was watching an Extended Events session filtered on "severity > 10" for something unrelated when I noticed this occurring every 15 minutes of every hour (at 14, 29, 44 and 59 minutes past the hour).
    It occurs only in the Extended Events session I created for "> 10 Severity". It does NOT show up in the SSMS AG Dashboard, nor in a separate Extended Events session where I selected the "Availability Group" events.
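
    For reference, a session of this kind could look roughly like the sketch below. This is not my exact definition: the session name and file name are hypothetical, and the extra actions (client application, host, login, statement text) are an assumption about what would help trace where the errors are raised from.

    ```sql
    -- Sketch only: session name and target file name are hypothetical.
    CREATE EVENT SESSION [errors_severity_gt_10] ON SERVER
    ADD EVENT sqlserver.error_reported (
        ACTION (
            sqlserver.client_app_name,   -- which application raised the error
            sqlserver.client_hostname,   -- which machine it connected from
            sqlserver.session_id,
            sqlserver.username,
            sqlserver.sql_text           -- statement that was running, if any
        )
        WHERE ([severity] > 10)          -- same filter as described above
    )
    ADD TARGET package0.event_file (
        SET filename = N'errors_severity_gt_10.xel'
    )
    WITH (STARTUP_STATE = OFF);
    GO

    ALTER EVENT SESSION [errors_severity_gt_10] ON SERVER STATE = START;
    GO
    ```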

    This is in production. We are not seeing AG fail-overs. The users seem totally unaffected, but it concerns me.
    I created the extended event session 3 days ago. 

    Two nodes in the AlwaysOn cluster with matching OS and SQL Server versions, using synchronous-commit AlwaysOn:
    - SQL Server 2012 Enterprise SP4
    - (VM) Windows Server 2016  (applied all latest Windows updates 5 days ago)

    In the Extended Events session, every 15 minutes, within a 1 to 2 second window, I receive about 307 errors (with repeats) covering 28 unique error numbers that all say basically the same thing: the secondary is not joined, data synchronization is suspended, or replicas are not connected, unhealthy, and not synchronizing.
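
    One way to cross-check those claims against what the instance itself reports is to query the AlwaysOn DMVs directly. This is a minimal sketch, assuming it is run on the primary replica; the column aliases are my own.

    ```sql
    -- Sketch: current connection, suspend and health state per replica and database.
    SELECT ar.replica_server_name,
           ars.role_desc,
           ars.connected_state_desc,
           ars.synchronization_health_desc AS replica_health,
           DB_NAME(drs.database_id)        AS database_name,
           drs.synchronization_state_desc,
           drs.synchronization_health_desc AS database_health,
           drs.is_suspended,
           drs.suspend_reason_desc
    FROM sys.availability_replicas AS ar
    JOIN sys.dm_hadr_availability_replica_states AS ars
      ON ars.replica_id = ar.replica_id
    LEFT JOIN sys.dm_hadr_database_replica_states AS drs
      ON drs.replica_id = ar.replica_id
    ORDER BY ar.replica_server_name, database_name;
    ```

    If this shows CONNECTED / SYNCHRONIZED / HEALTHY at the times the errors fire, the messages may not reflect the real state of the availability group.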
    All messages are:
    - Severity 16
    - State 1
    - Category 2
    With the error message numbers covering the full range from 41401 thru 41428.
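
    The full text of that message range can be listed from the `sys.messages` catalog view. A minimal sketch, assuming the English (1033) message set is the one of interest:

    ```sql
    -- Sketch: list the definitions of messages 41401 through 41428.
    SELECT message_id,
           severity,
           is_event_logged,
           text
    FROM sys.messages
    WHERE message_id BETWEEN 41401 AND 41428
      AND language_id = 1033      -- assumption: English messages
    ORDER BY message_id;
    ```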

    The lowest-numbered messages include two related to the WSFC. Normally I would think these might be the key, but they are the 13th and 14th messages in the order they appear.
    Error # 41401 - WSFC service is offline
    Error # 41402 - The WSFC cluster is offline, and this availability group is not available. This issue can be caused by a cluster service issue or by loss of quorum in the cluster.
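
    Whether the WSFC cluster and quorum really are unavailable can be checked from SQL Server's own view of the cluster. A minimal sketch using the cluster DMVs, run on either node:

    ```sql
    -- Sketch: quorum state as seen by this SQL Server instance.
    SELECT cluster_name,
           quorum_type_desc,
           quorum_state_desc
    FROM sys.dm_hadr_cluster;

    -- Sketch: state and vote count of each cluster member.
    SELECT member_name,
           member_type_desc,
           member_state_desc,
           number_of_quorum_votes
    FROM sys.dm_hadr_cluster_members;
    ```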

    Any thoughts?
