Basic Availability Group setup, but replicas went into Resolving mode

  • pols

    SSC Eights!

    Points: 861

    Hi,

    I configured a Basic Always On setup on two SQL Server 2017 Standard test servers (A and B).

    As the basic configuration allows, I was able to add only one database to the group. The configuration went well, including automatic failover from primary A to secondary B once I stopped the SQL services on the primary server. Secondary B then became the primary.

    I then started the SQL services on server A and stopped the services on server B. But instead of failing over, both servers went into Resolving mode. To resolve this I started the SQL service on server B again and waited the whole night to see whether the databases would sync. I even tried a forced failover, but nothing worked.

    Now both servers are in the Resolving state, and I don't understand what to do next.

    The error I am getting is below:

    The availability replica for availability group 'xyz' on this instance of SQL Server cannot become the primary replica because the WSFC cluster was started in Force Quorum mode. Consider performing a forced manual failover (with possible data loss). Error 41125.

    I checked WSFC: the AG role XYZ has failed.

    Both nodes are up.

    Now both replicas are in the Resolving state. :(

    Is there anything I need to check? Please suggest.

    • This topic was modified 8 months, 4 weeks ago by  pols.
  • as1981

    SSCrazy

    Points: 2744

    Sorry I can't answer the question you asked but I can suggest something that might help avoid it in the future.

    Does the Windows cluster have only the two nodes in it? If so, it might be worth adding something like a file share witness. This can help stop things like this from occurring. This link might be useful: https://docs.microsoft.com/en-us/windows-server/failover-clustering/manage-cluster-quorum

    Also be careful that nothing takes all the network bandwidth. In that situation you can get a failover because the cluster can't see the active node and fails over. I had this happen with backups.
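
    To see how the cluster quorum is currently set up, something like the following can be run on either SQL Server instance. It is only a sketch: it reads the cluster DMVs and changes nothing, and it assumes the login has VIEW SERVER STATE permission.

    ```sql
    -- Quorum type and state of the WSFC cluster as seen by this instance.
    SELECT cluster_name, quorum_type_desc, quorum_state_desc
    FROM sys.dm_hadr_cluster;

    -- Every cluster member (nodes and any witness) with its quorum vote count.
    SELECT member_name, member_type_desc, member_state_desc, number_of_quorum_votes
    FROM sys.dm_hadr_cluster_members;
    ```

    If the second query lists only the two nodes and no witness, losing either node can cost the cluster its quorum.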

  • e4d4

    SSCertifiable

    Points: 5755

    Looks like this wasn't an automatic failover but a forced one, "because the wsfc cluster was started in force quorum mode". For automatic failover you need a proper quorum configuration, as as1981 mentioned.

    You have made a real mess, but is this a lab or is it production? 😉 Now you must bring one node online with a forced failover, resume synchronization, and only then perform a manual failover (SQL Server on both nodes must be up). Then prepare the quorum configuration for automatic failover and test it.

    https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/perform-a-forced-manual-failover-of-an-availability-group-sql-server?view=sql-server-ver15
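
    In T-SQL the sequence might look roughly like the sketch below. The group name xyz is taken from the error message above; [YourDb] is a placeholder for the single database in the group, since its name was not given. A forced failover can lose data, so only run it on the replica you have decided to keep.

    ```sql
    -- Step 1: on the replica that should become primary (possible data loss).
    ALTER AVAILABILITY GROUP [xyz] FORCE_FAILOVER_ALLOW_DATA_LOSS;

    -- Step 2: on the other replica, once it has rejoined as secondary,
    -- resume the suspended data movement. [YourDb] is a placeholder name.
    ALTER DATABASE [YourDb] SET HADR RESUME;

    -- Step 3: after the secondary reports SYNCHRONIZED, a planned manual
    -- failover is run on that secondary if the roles need to be swapped.
    ALTER AVAILABILITY GROUP [xyz] FAILOVER;
    ```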
