[SOLVED] Connectivity issue with a multi-subnet AlwaysOn

  • Hello,

    I have a problem with a multi-subnet setup that I can't figure out how to resolve.
    I have a two-replica AlwaysOn availability group on SQL Server 2016 SP2 CU, spread across 2 subnets.
    On my client, a Windows 2008 R2 SP1 VM, I installed the latest ODBC drivers.
    When I create my ODBC data source on the client machine, I use the listener name as the target (MyListener,1573 where 1573 is my listener port). I checked "Multi-Subnet failover" and "Transparent network IP resolution".
    I checked with nslookup how the listener name resolves, and I see my 2 IPs associated with the name.
    But when I test the ODBC data source, it fails: [Microsoft][ODBC Driver 17 for SQL Server]Unable to complete login process due to delay in opening server connection

    I installed Wireshark to see what happens, and I do see 2 connections started when I test: one goes to the first IP, the second to the other IP. One of the connections completes its SYN/SYN-ACK/ACK handshake; the other doesn't.
    So even though one of the connections succeeds, the ODBC driver reports a failure.
    Connecting without the multi-subnet option works 50% of the time, as expected.

    Do you have any idea where this could come from?

    Thanks in advance,
    Regards,
    Vincent
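
    The nslookup and Wireshark checks described above can be reproduced from the client with a short script. This is a minimal sketch, assuming the listener name and port from the post (MyListener, 1573) and a 3-second timeout chosen arbitrarily for illustration. It uses only .NET classes, so it should also run on the older PowerShell that ships with Windows 2008 R2. With a multi-subnet listener, normally only the IP in the current primary's subnet is online, so one address failing to answer is expected rather than a fault in itself.

    ```powershell
    # Assumed values taken from the post; adjust to your environment.
    $listener  = 'MyListener'
    $port      = 1573
    $timeoutMs = 3000   # arbitrary timeout, for illustration only

    # Resolve every address registered in DNS for the listener name.
    $addresses = [System.Net.Dns]::GetHostAddresses($listener)

    foreach ($ip in $addresses) {
        $client = New-Object System.Net.Sockets.TcpClient($ip.AddressFamily)
        try {
            # Start the TCP handshake asynchronously, then wait up to the timeout.
            $async = $client.BeginConnect($ip, $port, $null, $null)
            if (-not $async.AsyncWaitHandle.WaitOne($timeoutMs, $false)) {
                Write-Output "$ip : no answer within $timeoutMs ms"
            } else {
                # EndConnect throws if the connection was refused or failed.
                $client.EndConnect($async)
                Write-Output "$ip : TCP handshake completed"
            }
        } catch {
            Write-Output "$ip : failed ($($_.Exception.Message))"
        } finally {
            $client.Close()
        }
    }
    ```

    If exactly one address completes the handshake, DNS and the network path are probably behaving as designed, and the problem is more likely in how the client handles the parallel connection attempts.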

  • Have you tried setting RegisterAllProvidersIP = 0 so that only the active IP address is registered in DNS? https://blogs.msdn.microsoft.com/alwaysonpro/2014/06/03/connection-timeouts-in-multi-subnet-availability-group/

    (If you do this, you may want to experiment with lowering your HostRecordTTL so that, in the event of a failover, you don't have to wait a long time for the cached DNS record to expire.)

  • No: I would like to avoid that kind of solution, which basically means giving up on the "multisubnet failover" option of the ODBC driver. Since this issue is for a client, I cannot suggest this kind of solution except as a last resort.

  • It doesn't "give up on" the multisubnet failover capability, it works with it.

  • From my understanding, using it makes the connection much slower, as it has to go through all the available IPs. It's still a workaround.

  • Quite the opposite; it only registers the active IP, so it is quicker to detect rather than trying all available IPs.

  • But with a TTL of, let's say, 60 seconds, my client might hit some timeouts.

  • Yes, hence my point above about possibly lowering it. You have to test things.

  • I will make it easy for you to test:

    # Each Get-ClusterParameter line shows the current value so you can revert.
    # The commented-out Set-ClusterParameter lines are the changes to try.

    # NetBIOS setting on the IP Address resources
    Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq "IP Address"} | Get-ClusterParameter -Name EnableNetBIOS
    #Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq "IP Address"} | Set-ClusterParameter -Name EnableNetBIOS -Value 0
    #Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq "IP Address"} | Set-ClusterParameter -Name EnableNetBIOS -Value 1
    #Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq "IP Address"} | Set-ClusterParameter -Name EnableNetBIOS -Value 2

    # DNS record TTL (in seconds) on the Network Name resources
    Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq "Network Name"} | Get-ClusterParameter -Name HostRecordTTL
    #Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq "Network Name"} | Set-ClusterParameter -Name HostRecordTTL -Value 300

    # Register all listener IPs in DNS (1) or only the online one (0)
    Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq "Network Name"} | Get-ClusterParameter -Name RegisterAllProvidersIP
    #Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq "Network Name"} | Set-ClusterParameter -Name RegisterAllProvidersIP -Value 1
    #Get-ClusterResource | Where-Object {$_.ResourceType.Name -eq "Network Name"} | Set-ClusterParameter -Name RegisterAllProvidersIP -Value 0
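
    Note that the commands above touch every Network Name resource in the cluster, including the cluster name itself, and that a changed parameter only takes effect once the resource has been taken offline and brought back online. A more targeted sequence might look like the sketch below. The resource name 'MyListenerResource' and the group name 'MyAgGroup' are placeholders, not names from this thread. Expect a short listener outage while the resource restarts, so try it in a maintenance window first.

    ```powershell
    Import-Module FailoverClusters

    # Placeholder names: look up the real ones with Get-ClusterResource / Get-ClusterGroup.
    $listenerResource = 'MyListenerResource'
    $agGroup          = 'MyAgGroup'

    # Change only the listener's Network Name resource.
    Get-ClusterResource $listenerResource | Set-ClusterParameter -Name RegisterAllProvidersIP -Value 0
    Get-ClusterResource $listenerResource | Set-ClusterParameter -Name HostRecordTTL -Value 300

    # Restart the resource so the new parameters are applied.
    Stop-ClusterResource $listenerResource
    Start-ClusterResource $listenerResource

    # Bring back anything in the group that went offline with the listener.
    Start-ClusterGroup $agGroup
    ```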

  • Thanks for the PowerShell commands.
    But do we agree that with that setup, with a TTL of 300 seconds, my client could face a period of up to 5 minutes after a failover during which it will NOT be able to connect to the listener?
    If so, this is a major issue with this solution.

    Thanks in advance,
    Regards,

  • OK! I finally found the cause of the issue!
    It's a Windows bug.

    It is documented here: https://support.microsoft.com/en-us/help/2870437/connection-times-out-when-you-use-alwayson-availability-group-listener
    I found the patch, and everything works fine now!

    Thanks for your help anyway 🙂
