File share witness in an Always On cluster

  • Another DBA mistakenly created a file share witness for one of our AO clusters. To make matters worse, it's pointed at a server we decommissioned. The cluster doesn't seem to be negatively impacted by it, but it is throwing a set of critical errors dozens of times a day and I hate that. Our cluster has no shared storage whatsoever. There are no cloud resources involved. Is there any risk associated with simply removing the resource from the cluster?

    I've struggled to find an answer for this because it's not something that should have happened in the first place.

    • This topic was modified 1 month, 3 weeks ago by  robinpryor.
  • According to the docs: [screenshot: "What is an Always On availability group?", SQL Server Always On, Microsoft Learn]

     

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:


  • It depends on how the cluster is configured - and whether or not that witness is needed to maintain quorum.  If you are configured as node majority with a witness - and there are 2 nodes in the cluster, then you have 3 possible quorum votes.

    Since the file witness is down - the cluster is still healthy because you have 2 votes (nodes 1 & 2).  If Node 1 is taken down (patches - for example) then the cluster is no longer healthy and Windows will shut down the cluster service.  In order to maintain a healthy cluster you need *more* than 50% quorum votes - 2 of 3 is more than 50% but 1 of 3 is not.

    Now - if you remove the witness then your cluster will not be healthy if either node goes down, because 1 of 2 is *not* more than 50% quorum votes.

    Even if your clusters are configured with dynamic quorum - if you don't have enough quorum votes the cluster will no longer be healthy and will be shut down by Windows.
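To make the vote arithmetic above concrete, here is a minimal Python sketch. It assumes a plain static majority rule (strictly more than half of the configured votes) and deliberately ignores any vote adjustments that dynamic quorum might make; the function name `has_majority` is made up for illustration and is not part of any Windows or SQL Server tooling.

```python
def has_majority(online_votes: int, total_votes: int) -> bool:
    """Static majority rule: strictly more than half of all configured votes."""
    return 2 * online_votes > total_votes


# Two nodes plus a file share witness: 3 configured votes.
print(has_majority(2, 3))  # witness unreachable, both nodes up -> True
print(has_majority(1, 3))  # witness unreachable, one node down -> False

# Witness removed: only the 2 node votes remain.
print(has_majority(1, 2))  # one node down -> False
```

Under this simplified rule, a two-node cluster cannot survive the loss of a node in either configuration while the witness is unreachable, which is the point made above.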

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • robinpryor wrote:

    Another DBA mistakenly created a file share witness for one of our AO clusters. To make matters worse, it's pointed at a server we decommissioned. The cluster doesn't seem to be negatively impacted by it, but it is throwing a set of critical errors dozens of times a day and I hate that. Our cluster has no shared storage whatsoever. There are no cloud resources involved. Is there any risk associated with simply removing the resource from the cluster?

    I've struggled to find an answer for this because it's not something that should have happened in the first place.

    For the underlying WSFC for your AG, just create a new file share witness and configure the cluster to point to it. No problem at all.

  • Once again - we have absolutely no shared storage in this cluster. None. There are actually 3 nodes - 2 synchronous - 1 asynchronous for DR. The problem isn't that the witness is pointed at a server that doesn't exist. The problem is that it shouldn't have been there in the first place. It's been failing for a year and a half without anyone even noticing, so the cluster health is fine. What I'm looking for is opinions on whether or not it will hurt anything to remove it now that it is there.

    Honestly, we've pretty much decided to just go for it because I cannot find any answers on this. Everyone wants to explain what a witness is. I know what a witness is. It seems to be very difficult to find an answer on the negative impact of removing something that's not supposed to be there anyway. I'm going with that first answer. At least that one acknowledges that I said that it shouldn't have been created.

  • Well, a file share isn't shared storage. Just for the record. Obviously you want us to authorize you to remove it. Go ahead, remove it and tell us what happened. If your cluster is configured with node majority then nothing will happen. Just a quick Google search would have enlightened you. I look forward to the result. We want to learn something.

     

  • I will add a few more points to consider:

    1. In a 3-node cluster - where one of the nodes is in another data center, what happens if you lose connectivity to that data center?
    2. What will happen if you lose the primary data center?

    As you are currently configured (assuming no file witness since it isn't working) - if you lose connectivity to the DR site and then one of the nodes in the primary data center goes down your cluster will be down.

    If the primary data center goes down - you will not be able to bring up the secondary in the DR site without a cluster reconfiguration.

    If you did have a working file (cloud) witness accessible from both DCs, you could set the node weight on the DR node to 0, removing that node from quorum.  Then, if you lose connectivity to the DR site, you have no issues - you still have 3 votes and can restart either node with no impact.  Additionally, upon failover, you can reset the node weight and bring up that secondary with a healthy cluster.

    By keeping the file (cloud) witness and including that in the dynamic quorum configuration - the witness would not be included if all 3 nodes have a vote and would only be considered if one of the nodes is down.  That would ensure you always have at least 2 of the 3 quorum votes needed to maintain the cluster.  It would also be available for the DR system and would maintain 2 of 3 quorum votes if both nodes in the primary data center are down.

    Saying that file witness *should never have been added* is not correct.  By having the file witness you have more flexibility in the cluster to handle individual node outages as well as data center outages and network outages.
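The scenarios above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a static majority rule with no dynamic quorum adjustments, and the member names (`node1`, `node2`, `dr_node`, `witness`) and the helper `partition_has_quorum` are hypothetical, not taken from the actual cluster or from any Microsoft API.

```python
def partition_has_quorum(votes: dict, reachable: set) -> bool:
    """Return True if the reachable members hold a strict majority of all votes."""
    total = sum(votes.values())
    online = sum(weight for name, weight in votes.items() if name in reachable)
    return 2 * online > total


# Current state: the witness is dead, so only the three nodes vote.
no_witness = {"node1": 1, "node2": 1, "dr_node": 1}

# DR link lost, then node1 goes down: node2 alone holds 1 of 3 votes.
print(partition_has_quorum(no_witness, {"node2"}))    # False

# Primary data center lost: the DR node alone holds 1 of 3 votes.
print(partition_has_quorum(no_witness, {"dr_node"}))  # False

# Working witness reachable from both sites, DR node weight set to 0.
with_witness = {"node1": 1, "node2": 1, "dr_node": 0, "witness": 1}

# DR link lost: the primary site still sees 3 of 3 votes.
print(partition_has_quorum(with_witness, {"node1", "node2", "witness"}))  # True

# DR link lost and node1 restarted: node2 plus the witness hold 2 of 3.
print(partition_has_quorum(with_witness, {"node2", "witness"}))           # True
```

The model treats "unreachable" and "down" the same way from the point of view of the surviving partition, which is enough to show why a working witness changes the outcome.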

     

    Jeffrey Williams
