AG Listener Names Unavailable for 10-15 Minutes After a Failover

  • I inherited a two-node AG setup on SQL Server 2016 Standard Edition.  There are 5 AGs on it, one for each Citrix-related database on the instances.  I have 4 Listeners.  I'm not sure why one of the AGs doesn't have a Listener.  The AG Properties dialog shows that they're all set to Synchronous Commit and automatic failover.  Whenever we've done a manual failover to the secondary, the failover process in SQL Server shows everything completed in 10-20 seconds and synchronization is successful.  However, it has always taken 10-15 minutes after the failover before I can connect to the SQL instances using any of the four listener names.  When this has happened, I've gotten into the habit of periodically using PING to see when I get a response, so I know when to test the SQL connection.  What do we need to investigate to see why the Listener Names are taking so long to respond after a failover?

    I don't think this is related, because we've had the above issue for a long time, but recently I've started seeing the error below in Failover Cluster Manager for one specific listener name.  The Sr. SysAdmin here says that this one Listener is set up as a static entry while the others are not.

    Cluster network name resource failed registration of one or more associated DNS name(s) because the access to update the secure DNS Zone was denied.

    Cluster Network name: 'MyClusterName_MyAGName_ASpecificListenerName'
    DNS Zone: 'Hunter.com'

    Ensure that cluster name object (CNO) is granted permissions to the Secure DNS Zone.

  • Check if those 4 listeners were added in AD as computer objects.

    Alex S
  • AlexSQLForums - Sunday, December 9, 2018 5:47 PM

    Check if those 4 listeners were added in AD as computer objects.

    Those 4 Listeners already exist in AD.  That's how the Citrix app currently makes its connections.

  • I suspect that if DNS does not get updated after the failover, you won't be able to connect because the listener name will still be pointing to the old IP. I have seen this in multi-subnet configurations. Is your cluster spanning multiple subnets? If so, you might have to update the HostRecordTTL value.

    https://blogs.msdn.microsoft.com/alwaysonpro/2014/06/03/connection-timeouts-in-multi-subnet-availability-group/

  • goher2000 - Monday, December 10, 2018 1:44 PM

    I suspect that if DNS does not get updated after the failover, you won't be able to connect because the listener name will still be pointing to the old IP. I have seen this in multi-subnet configurations. Is your cluster spanning multiple subnets? If so, you might have to update the HostRecordTTL value.

    https://blogs.msdn.microsoft.com/alwaysonpro/2014/06/03/connection-timeouts-in-multi-subnet-availability-group/

    That's a good thought, but it isn't multi-subnet.

  • re: the missing listener. It's possible that apps are accessing DBs in the 5th AG by connecting to one of the other listeners.  This will work fine until the 5th AG ends up hosted on a different replica than the other four.

    re: long timeout, very curious!  It does sound like a cross-subnet issue, but that's not the case here.

  • Can you try connecting using the listener IP instead of the listener name after a failover, to see if it works?
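
    If it helps, the name-versus-IP test can be scripted instead of done by hand with PING. The following is only a rough Python sketch, not a definitive tool. It assumes the listener is on the default port 1433 (yours may differ), and the function names `resolve`, `port_open`, `diagnose`, and `poll` are made up for illustration. It only checks that the TCP port accepts a connection; it does not log in to SQL Server.

```python
import socket
import time


def resolve(name):
    """Return the sorted IPv4 addresses a name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(name, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def port_open(host, port=1433, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, timed out, and name-resolution errors
        return False


def diagnose(name_ok, ip_ok):
    """Turn the two test results into a hint about where to look next."""
    if name_ok and ip_ok:
        return "reachable by name and by IP"
    if ip_ok:
        return "IP works but name does not: look at DNS (stale record, failed registration, client cache)"
    if name_ok:
        return "name works but the tested IP does not: the name resolves to a different IP"
    return "neither works: the listener IP resource itself is not reachable yet"


def poll(listener_name, listener_ip, port=1433, interval=15, attempts=60):
    """Log the state every `interval` seconds until the name accepts connections."""
    for _ in range(attempts):
        ips = resolve(listener_name)
        name_ok = port_open(listener_name, port)
        ip_ok = port_open(listener_ip, port)
        print(time.strftime("%H:%M:%S"), ips, diagnose(name_ok, ip_ok))
        if name_ok:
            return True
        time.sleep(interval)
    return False
```

    Start something like `poll("MyListenerName", "10.0.0.50")` right after the failover (both values are placeholders, substitute your own listener name and IP). If the IP starts answering within seconds while the name stays dead for 10-15 minutes, that points at DNS rather than at SQL Server or the cluster IP resource.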
