Application starts throwing SQL error after database failover.

  • I have a situation and looking for some guidance. I have a two node SQL server always on on setup for a HA solution. the database is setup for automatic fail-over and listener is setup as well.

    Recently we had a network issue and  the database failed over to the secondary node, which became the primary. From my end every thing seems to be fine.

    The databases were available on node B now. I was able to query. use the listener name to connect using SSMS. However, as soon as after the fail-over, the application stop and a pop up error shows saying "SQL error". In the logs of node B I don't see any errors but I also don't see any request coming to Node B either. It's like the application is not sending any request to node B. As soon as I fail-over back to node A, the application starts working instantaneously. This what I have tried so Far

    - Asked the  application team to make sure the connection or what ever configuration they r using to point to the database, use it with the  listener name only. I have scripted out the logins from server A with original SIDs and implemented on server B. I don't see any login failed error in the sql logs  when server B is the primary so I don't think it's the login issue but lets see.

    Both of the  SQL server  nodes are within same sub-net. The listener IP is using just one subnet.

    Hope you  guys can help.

    Regards,
    Bobby

  • hello Bobby -- have you considered logging in from the problematic app host and trying a ping, then a simple sqlcmd login just to check if it would go though? usually the error thrown would be indicative of the issue.

    a few ideas that come to mind that might help would be: check the connection string of the app; check if you have a DNS alias issue, a typo, incorrect connection string params, try the same test from a different host or a host that is known to work.

    I would personally try different ways of connecting to the replica to try to get an error message and narrow down the issue.

    hope this helps
    ivan

  • Hello.

    I haven't had a chance to actually try and work with on the application server. However, just yesterday application guys send a snap-shot of the logs they say appears in their logs when issue happens.

  • SQL Server]Unable to access availability database'XXXX' because the database replica is not in the PRIMARY or SECONDARY role.

  • Connections to an availability database is permitted only when the database replica.......
  • I understand this to appear when the network issue was happening, or the DB is in the process of moving from primary to secondary node. However, once the fail-over is complete this should disappear.

  • This appears after failover is complete? I would double check all the configuration and make sure every database you expect is in the AG. Make sure the database is not in recovery.

  • Hi, Thanks for the response.

    Yes, I checked it, once the fail over is complete, all the databases are up sync and available.

    Regards

  • hello -- you might already be aware that the driver they use will have an impact?
    But just in case here's the link: https://docs.microsoft.com/en-us/sql/database-engine/availability-groups/windows/always-on-client-connectivity-sql-server

    e.g.

    DriverMulti-Subnet FailoverApplication IntentRead-Only RoutingMulti-Subnet Failover: Faster Single Subnet Endpoint FailoverMulti-Subnet Failover: Named Instance Resolution For SQL Clustered Instances
    SQL Native Client 11.0 ODBCYesYesYesYesYes
    SQL Native Client 11.0 OLEDBNoYesYesNoNo
    ADO.NET with .NET Framework 4.0 with connectivity patch*YesYesYesYesYes
    ADO.NET with .NET Framework 3.5 SP1 with connectivity patch**YesYesYesYesYes
    Microsoft JDBC driver 4.0 for SQL Server
  • Check that the app is connecting to the listener, not to the server directly.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • qur7 - Thursday, February 22, 2018 1:45 PM

    I have a situation and looking for some guidance. I have a two node SQL server always on on setup for a HA solution. the database is setup for automatic fail-over and listener is setup as well.

    Recently we had a network issue and  the database failed over to the secondary node, which became the primary. From my end every thing seems to be fine.

    The databases were available on node B now. I was able to query. use the listener name to connect using SSMS. However, as soon as after the fail-over, the application stop and a pop up error shows saying "SQL error". In the logs of node B I don't see any errors but I also don't see any request coming to Node B either. It's like the application is not sending any request to node B. As soon as I fail-over back to node A, the application starts working instantaneously. This what I have tried so Far

    - Asked the  application team to make sure the connection or what ever configuration they r using to point to the database, use it with the  listener name only. I have scripted out the logins from server A with original SIDs and implemented on server B. I don't see any login failed error in the sql logs  when server B is the primary so I don't think it's the login issue but lets see.

    Both of the  SQL server  nodes are within same sub-net. The listener IP is using just one subnet.

    Hope you  guys can help.

    Regards,
    Bobby

    Hi,
        if, as I suspect, the nodes are on different subnets, to get automatic failover you need to ensure that your connection string has the MULTISUBNETFAILOVER=TRUE option is set, and that the driver, and if applicable .NET framework version supports it.  If not, then either you have to repoint manually, or it's not going to resolve until after your next DNS refresh

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • Thanks for the feedback.

    the two nodes are in the same subnet. I was aware about the different sub-net issue however didn't think it applied in my case.
    we resolved the issue. one of the configuration from the application side was using node A server name instead of listener name. once this was corrected, it all worked.
    Thanks for the help.

    Regards,

  • qur7 - Tuesday, February 27, 2018 7:06 AM

    Thanks for the feedback.

    the two nodes are in the same subnet. I was aware about the different sub-net issue however didn't think it applied in my case.
    we resolved the issue. one of the configuration from the application side was using node A server name instead of listener name. once this was corrected, it all worked.
    Thanks for the help.

    Regards,

    good to know. checking the connection string was part of my earlier suggestion. 🙂

  • Viewing 10 posts - 1 through 9 (of 9 total)

    You must be logged in to reply to this topic. Login to reply