Mirroring - Intermittent "network name is no longer available"
Posted Monday, September 9, 2013 1:41 PM
SSC Rookie

Is this a network issue?

I have a mirroring setup - synchronous with automatic failover. The databases went into Disconnected mode for about a minute last night, and everything was synchronized soon after. Looking at the error logs, the principal server could not contact the secondary server. The secondary server could not contact the witness. But I don't see a message that the principal lost contact with the witness.

There weren't any index defrag jobs running, or heavy activity at the time. This is the first time the issue has happened around that time of day.

Principal error log
12:04:06 AM Database mirroring connection error 4 '64(The specified network name is no longer available.)' for 'TCP://prod02.dg.com:5023'.
12:04:27 AM Database mirroring connection error 2 'Connection attempt failed with error: '10060(A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.)'.' for 'TCP://prod02.dg.com:5023'.

Secondary error log
12:03:46 AM Database mirroring connection error 4 'An error occurred while receiving data: '64(The specified network name is no longer available.)'.' for 'TCP://PROD-WITNESS.dg.com:5024'.
12:03:46 AM Database mirroring connection error 4 '10054(An existing connection was forcibly closed by the remote host.)' for 'TCP://prod01.dg.com:5022'.

I am in a DBA role and am responsible for doing a root cause analysis. Does this look like a network issue? If so, is there anything I can suggest that my IT department look into to pinpoint the problem? We've had similar issues intermittently, and they haven't found anything. We use VMware.

How can I tell if there was a loss of quorum in a mirroring session? There were application timeout errors, and I am trying to determine whether these were due to the mirroring issues or just the network. Any thoughts you have would be appreciated.
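For the root cause analysis, it may help to put the two error logs on a single timeline so the order of events is easier to see. The following is a minimal Python sketch that does this for the four log lines quoted above. The regular expression, the function name build_timeline and the field names are assumptions made for illustration; they are not part of any SQL Server tooling, and the pattern has only been matched against these four lines.

```python
import re
from datetime import datetime

# The four lines quoted from the error logs above, tagged with the server they came from.
LOG_LINES = [
    ("principal", "12:04:06 AM Database mirroring connection error 4 "
                  "'64(The specified network name is no longer available.)' "
                  "for 'TCP://prod02.dg.com:5023'."),
    ("principal", "12:04:27 AM Database mirroring connection error 2 "
                  "'Connection attempt failed with error: '10060(A connection attempt "
                  "failed because the connected party did not properly respond after a "
                  "period of time, or established connection failed because connected "
                  "host has failed to respond.)'.' for 'TCP://prod02.dg.com:5023'."),
    ("secondary", "12:03:46 AM Database mirroring connection error 4 "
                  "'An error occurred while receiving data: '64(The specified network "
                  "name is no longer available.)'.' for 'TCP://PROD-WITNESS.dg.com:5024'."),
    ("secondary", "12:03:46 AM Database mirroring connection error 4 "
                  "'10054(An existing connection was forcibly closed by the remote "
                  "host.)' for 'TCP://prod01.dg.com:5022'."),
]

# Assumed pattern: time of day, the number after "connection error",
# the first quoted OS error code, and the endpoint named after "for".
PATTERN = re.compile(
    r"^(?P<time>\d{1,2}:\d{2}:\d{2} [AP]M) Database mirroring connection error "
    r"(?P<conn_error>\d+) .*?'(?P<os_error>\d+)\(.*for '(?P<endpoint>TCP://[^']+)'"
)


def build_timeline(tagged_lines):
    """Return (time, server, os_error, endpoint) tuples sorted by time of day."""
    rows = []
    for server, line in tagged_lines:
        match = PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not look like mirroring connection errors
        when = datetime.strptime(match["time"], "%I:%M:%S %p").time()
        rows.append((when, server, int(match["os_error"]), match["endpoint"]))
    # sorted() is stable, so lines with the same timestamp keep their input order
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    for when, server, os_error, endpoint in build_timeline(LOG_LINES):
        print(when, server, os_error, endpoint)
```

On these four lines, the sorted result puts both secondary-side errors (to the witness and to prod01) at 12:03:46 AM, ahead of the principal's errors at 12:04:06 AM and 12:04:27 AM. The sketch only orders the entries; it does not by itself say whether quorum was lost.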

Thanks!
Dan
Post #1492909
Posted Monday, September 9, 2013 1:46 PM


Ten Centuries

We just experienced a long period of network performance issues that resulted in a lot of those same errors in SQL backups. The root cause was a bad drive in the SAN.
Post #1492911
Posted Monday, September 16, 2013 8:17 AM
SSC Rookie

Thanks for your response. I will keep that in mind.
Post #1495108
Posted Monday, May 5, 2014 1:50 PM
SSC Veteran

Hi, I recently did an in-place upgrade from SQL Server 2008 R2 to SQL Server 2012 SP1 CU8, then dropped and re-added mirroring, and now we are having the same issue with intermittent connectivity.

Did you ever resolve your trouble, and if so, how?

Post #1567700
Posted Monday, May 5, 2014 2:22 PM
SSC Rookie

There were errors in the mirror server Windows log, and I assume that it is a problem with the disk subsystem.

Event ID: 129
Reset to device, \Device\RaidPort3, was issued.

Event ID: 129
Reset to device, \Device\RaidPort2, was issued.

Unfortunately, I don't think we resolved it, as issues happen occasionally. But it's beyond my expertise or permission as a DBA. Thankfully, it doesn't happen all the time.
Post #1567706
Posted Tuesday, May 6, 2014 5:01 AM
SSCrazy

I have had situations in the past where SQL Server reported a network problem but the Network people said nothing had gone wrong.

Eventually we wrote a simple script that did a PING every second and saved the result. This showed intermittent network problems, and the Network people had to accept that something had gone wrong. Eventually they found and fixed the problem, but as with anything that is intermittent it took a while to sort out.

If you can prove that the problem happens outside of SQL Server, then you have a better chance of the subject matter experts accepting a problem exists and getting it fixed.
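A once-per-second probe of this kind could be sketched in Python roughly as follows. Note that this is not an ICMP PING: it opens a TCP connection to a port instead, which needs no elevated privileges and can be pointed at the mirroring endpoint itself. The function names probe and monitor, the CSV layout, and the one-second timeout are all assumptions chosen for illustration.

```python
import csv
import socket
import sys
import time
from datetime import datetime


def probe(host, port, timeout=1.0):
    """Attempt one TCP connection; return (ok, elapsed_ms, error_text)."""
    ok, error = True, ""
    start = time.perf_counter()
    try:
        # create_connection resolves the name and connects, or raises OSError
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError as exc:
        ok, error = False, str(exc) or type(exc).__name__
    elapsed_ms = round((time.perf_counter() - start) * 1000.0, 1)
    return ok, elapsed_ms, error


def monitor(host, port, samples, interval=1.0, out=sys.stdout):
    """Probe `samples` times, write one CSV row per attempt, return the failure count."""
    writer = csv.writer(out)
    failures = 0
    for _ in range(samples):
        ok, elapsed_ms, error = probe(host, port)
        stamp = datetime.now().isoformat(timespec="seconds")
        writer.writerow([stamp, host, port, ok, elapsed_ms, error])
        if not ok:
            failures += 1
        time.sleep(interval)
    return failures
```

For example, monitor("prod02.dg.com", 5023, samples=3600) would probe the mirror endpoint named in the error logs above for about an hour. Running it from the principal with the output redirected to a file gives timestamps that can be lined up against the SQL Server error log.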


Post #1567874
Posted Tuesday, May 6, 2014 7:31 AM


SSCertifiable

PianoDan (9/9/2013)
But I don't see a message that the principal lost contact with the witness.

If it had, the principal database would have gone offline. Did this happen at all?
You may need to adjust the mirror timeout for this mirroring session. The default is 10 seconds; try raising it a little to cope with network outages.
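As a rough illustration of the timeout change, the small Python helper below builds the T-SQL statement (ALTER DATABASE ... SET PARTNER TIMEOUT), which is run on the principal, along with a query against sys.database_mirroring to read the current value. The helper name partner_timeout_sql, the database name SalesDB and the value 20 are assumptions for the example, not recommendations.

```python
# Query to read the current timeout (in seconds) for each mirrored database.
CURRENT_TIMEOUT_SQL = (
    "SELECT DB_NAME(database_id) AS database_name, mirroring_connection_timeout "
    "FROM sys.database_mirroring WHERE mirroring_guid IS NOT NULL;"
)


def partner_timeout_sql(database, seconds):
    """Build the T-SQL that sets the mirroring partner timeout for one database."""
    if not isinstance(seconds, int) or seconds <= 0:
        raise ValueError("timeout must be a positive whole number of seconds")
    # Bracket-quote the name and double any closing bracket inside it.
    quoted = "[" + database.replace("]", "]]") + "]"
    return f"ALTER DATABASE {quoted} SET PARTNER TIMEOUT {seconds};"


if __name__ == "__main__":
    # "SalesDB" and 20 are hypothetical values used only to show the output.
    print(CURRENT_TIMEOUT_SQL)
    print(partner_timeout_sql("SalesDB", 20))
```

A longer timeout makes the session more tolerant of short network blips, at the cost of slower detection of a genuine failure, so any new value is worth testing before relying on it.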


Post #1567937
Posted Tuesday, May 6, 2014 12:16 PM
SSC Veteran

Your messages look a lot like the ones we used to receive from our SAN during a backup. That may be something to check: if SAN backups are occurring, they will freeze the I/O and do a reset to device. We no longer see them since changing to Avamar backups. Those messages could also indicate SAN disk pressure, which I have seen in the past as well, but I would definitely check whether they correlate with some kind of SAN backup.
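One way to check for that correlation is to compare the disconnect times against the backup schedule. Here is a minimal Python sketch, assuming you can export both as timestamps. The function name events_near_windows and the two-minute slack are assumptions, and the backup window in the example is an invented placeholder, not data from this thread.

```python
from datetime import datetime, timedelta


def events_near_windows(events, windows, slack=timedelta(minutes=2)):
    """Return (event, start, end) for each event inside a window widened by `slack`."""
    hits = []
    for event in events:
        for start, end in windows:
            if start - slack <= event <= end + slack:
                hits.append((event, start, end))
                break  # one matching window per event is enough
    return hits


if __name__ == "__main__":
    # Disconnect time taken from the mirror's error log; the date is an assumption.
    disconnects = [datetime(2013, 9, 9, 0, 3, 46)]
    # Hypothetical SAN backup window, for illustration only.
    backups = [(datetime(2013, 9, 8, 23, 30), datetime(2013, 9, 9, 0, 10))]
    for event, start, end in events_near_windows(disconnects, backups):
        print(f"{event} falls within backup window {start} to {end}")
```

The same comparison works for the Event ID 129 timestamps from the Windows log if those are exported as well.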
Post #1568129