Availability groups (AG’s) is one of my favorite features of SQL Server, despite some DBA’s catalogue them as “too complex” to implement, configure, manage and maintain … I just love AG’s. There is no question that AG’s are a very powerful high availability (HA) and disaster recovery (DR) solution, but as any software product eventually will have some issues but nothing is impossible to solve with a little bit of research.
The error message 19405 is a very common “issue” when configuring AG’s for a Failover Cluster instance (FCI) .
I think you can’t see the error message directly when running the configuration wizard, but you will see something similar to this:
As you can see from the screenshot above, the wizard returned an error when trying to create the availability group called ‘MyAG‘, in addition there was a problem when joining the database ‘MyDB‘ into this availability group.
To be honest, I rarely use the wizard to create AG’s. I prefer to do it through T-SQL because I like to have a little bit more of control when creatingconfiguring and AG and its replicas. Another advantage of using T-SQL is that provides more details whenever something goes wrong. Here is how the error message looks like when creating the same AG through T-SQL:
Aha! The problem is not my database either any of my nodes (FCI’s) the root cause is the way the FCI’s are configured in the Windows Failover cluster (WFC), to be precisely the possible owner of each resource.
I know this could be a little bit confusing at the beginning, but let me explain what is going on:
- When using FCI’s as part of an AG, we are creating a new resource (role) within the WFC in this case is called ‘MyAG‘, but because the existing FCI’s are already part of the WFC the ownership of each FCI’s gets wonky, the WFC automatically adds the new “nodes” as possible owners.
- This is all happening because in my architecture, I have an existing WFC hosting a couple of FCI’s and what I want to do is to have one of those two FCI’s as part of new AG. This AG will be compose of this FCI and and also a standalone SQL Server that will be used as contingency plan in case the WFC goes down, here is a graphical interpretation of my scenario:
That explains why the error message says “remove the overlapped node from its possible owners“, let’s take a look at the issue from the WFC manager console:
As you can see I have two FCI’s running on this WFC let’s call them FCI_1 and FCI_2 just for illustrative purposes, this WFC already has their own nodes node_01 and node_02. In order to review the possible owners of each FCI, I just do a right click over SQL Server (highlighted in orange) and then I click on Properties (highlighted in green) then Advanced Policies:
This is exactly what I was looking for, there is more than two possible owners of my existing FCI resource in this WFC.
As you noticed, there are three possible owners for this FCI, as mentioned earlier this FCI belongs to a WFC of two nodes node_01 and node_02. The question is, what is that third node? Well, simple it belongs to the SQL Server standalone instance I’m using to create the AG which I definitively don’t want as a possible owner of my existing FCI.
The solution of this issue is to proceed removing (unchecking the box) the third “possible owner” from every single resource of all my existing FCI’s on this WFC (SQL Server, Server name, IP Address, MSDTC).
Once the task is completed each resource will look like this:
Noticed how the third node is no longer a possible owner for my resource, once you finish this modification you can re-create the AG and hopefully it will be successfull this time.
Thanks for reading!