Ownership of cluster disk 'Cluster Disk xxx' has been unexpectedly lost by this node.


sql-lover
My cluster went down again. Needless to say, I am already having a not-so-good morning ... :-(

Here's the Cluster's error:


Ownership of cluster disk 'Cluster Disk xxx' has been unexpectedly lost by this node. Run the Validate a Configuration wizard to check your storage configuration.


This looks to me like an OS or SAN error. The LUN disappears, then SQL goes down. Following our SAN admin's advice, we applied this patch: http://support.microsoft.com/?id=2718576 ... but it does not look like it resolved the main issue.

We only started having this issue a few weeks ago. Before that, the cluster ran fine for about two months, maybe a bit more, although with less workload.

Has anyone experienced this problem before?
Perry Whittle
Are you using iSCSI-attached storage and MPIO?
Can you provide a little more info on the storage and the connectivity from the nodes?

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" ;-)
sql-lover
Hi Perry,

The shared storage is a Dell Compellent SC8000 SAN, connected via iSCSI / MPIO to both nodes. The Windows Cluster runs on Win2008R2 SP1. MS-SQL runs SQL2012 Standard.

I also found this error in the Windows event log:


Connection to the target was lost. The initiator will attempt to retry the connection.


It clearly looks like an iSCSI / MPIO issue. In both incidents, the iSCSI mapping was lost, then SQL went down.
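One way to back up the "iSCSI drops first, then the disk is lost" observation is to line up the two kinds of events by time. The following is only a minimal Python sketch under stated assumptions: the timestamps, the "iscsi" / "cluster" labels, the 60-second window, and the function name are all made up for illustration, and real data would have to come from an export of the Windows System event log.

```python
from datetime import datetime, timedelta

# Hypothetical sample events (timestamp, kind). The timestamps and the
# "iscsi" / "cluster" labels are made up for illustration; real data would
# come from an export of the Windows System event log.
EVENTS = [
    ("2000-01-01 06:14:02", "iscsi"),    # "Connection to the target was lost."
    ("2000-01-01 06:14:31", "cluster"),  # "Ownership of cluster disk ... lost"
    ("2000-01-01 09:40:10", "iscsi"),    # a drop with no disk loss after it
]

FMT = "%Y-%m-%d %H:%M:%S"


def iscsi_drops_before_disk_loss(events, window_seconds=60):
    """Map each cluster disk loss to the iSCSI drops shortly before it."""
    parsed = [(datetime.strptime(ts, FMT), kind) for ts, kind in events]
    drops = [when for when, kind in parsed if kind == "iscsi"]
    window = timedelta(seconds=window_seconds)
    result = {}
    for when, kind in parsed:
        if kind == "cluster":
            # Keep only drops that happened at or before the disk loss,
            # and no more than `window` earlier.
            result[when] = [d for d in drops if timedelta(0) <= when - d <= window]
    return result


if __name__ == "__main__":
    for loss, drops in iscsi_drops_before_disk_loss(EVENTS).items():
        print(loss, "preceded by", len(drops), "iSCSI drop(s)")
```

If every disk loss has an iSCSI drop just before it, that supports looking at the iSCSI network and paths rather than at SQL Server itself.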

Our SAN expert's advice is to remove MPIO ???? Ermm ... but I've helped configure and deploy dozens of SQL clusters with MPIO before, and this is the first time I have seen this problem. Moreover, I believe removing MPIO will cause cluster validation issues and possibly data corruption.
Perry Whittle
sql-lover (4/23/2013)
Hi Perry,

The shared storage is a Dell Compellent SC8000 SAN, connected via iSCSI / MPIO to both nodes. The Windows Cluster runs on Win2008R2 SP1. MS-SQL runs SQL2012 Standard.

I'm assuming you're using the Microsoft iSCSI initiator?
Are you using the default MPIO driver or a Dell DSM?
If it is the MS driver, what policy are you using?



sql-lover (4/23/2013)
Our SAN expert advice is remove MPIO ???? Ermm ....

Some expert huh? ;-)
Without multi pathing things could be a whole lot worse.

sql-lover
The MCS policy was set to "round robin".

Now, I do believe we are using the default Microsoft MPIO driver, but where can I check that on Windows and confirm? I do not remember where ...

Also, I forgot to mention (and I was not actually aware of this until yesterday) that we do not have two switches but only one, and both nodes are connected to the same switch. That defeats part of the purpose of MPIO, I think. Not sure why our IT resource set it up that way.
Perry Whittle
sql-lover (4/24/2013)
The MCS policy was set to "round robin".

You use either MPIO or MCS, not both. So, are you using MCS or MPIO?
MCS is specific to the Microsoft iSCSI Initiator and uses a single session with multiple connections.
MPIO uses multiple sessions.
For more info on iSCSI see my article at this link.


sql-lover (4/24/2013)
The MCS policy was set to "round robin". Now, I do believe we are using the default Microsoft MPIO driver, but where can I check that on Windows and confirm? I do not remember where ...

Open the Microsoft iSCSI Initiator console, select the disk device and open the properties. You should see the MPIO button which will open the MPIO properties to view\change.



sql-lover (4/24/2013)
Also, I forgot to mention (and I was not actually aware of this until yesterday) that we do not have two switches but only one, and both nodes are connected to the same switch. That defeats part of the purpose of MPIO, I think. Not sure why our IT resource set it up that way.

When using storage multipathing, one would sort of hope that the hardware would be in place to support the topology, otherwise a switch hardware failure will make MPIO pointless!!
You should have more than 2 switches for your iSCSI network. A typical topology would have at least 2 core switches with edge switches feeding off these to provide multiple redundant paths down to your storage. This is all detailed in my article linked above.

The whole point of multipathing is to allow Windows Server to host highly available local SAN disks, otherwise the OS would see the multiple paths as separate disk devices, which they are not.

With 10GbE available you're exceeding the capabilities of a standard FC setup ;-)
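To make the single-switch concern concrete, here is a minimal Python sketch that looks for components shared by every path between a node and the storage. The component names (`nic1`, `switch-A`, `san-port-1`, and so on) and the function name are hypothetical, chosen only for illustration; they do not describe the actual environment in this thread.

```python
def single_points_of_failure(paths):
    """Return the components that every path depends on."""
    if not paths:
        return set()
    shared = set(paths[0])
    for path in paths[1:]:
        # A component survives only if it appears on this path as well.
        shared &= set(path)
    return shared


# Hypothetical layouts: each path lists the intermediate components
# between one node and the SAN (the endpoints themselves are left out).
one_switch = [
    ["nic1", "switch-A", "san-port-1"],
    ["nic2", "switch-A", "san-port-2"],
]
two_switches = [
    ["nic1", "switch-A", "san-port-1"],
    ["nic2", "switch-B", "san-port-2"],
]

print(single_points_of_failure(one_switch))    # the shared switch
print(single_points_of_failure(two_switches))  # empty set
```

With one switch, both paths share that switch, so losing it takes out every path at once regardless of the MPIO policy.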

sql-lover
Perry Whittle (4/24/2013)

When using storage multipathing, one would sort of hope that the hardware would be in place to support the topology, otherwise a switch hardware failure will make MPIO pointless!!

You should have more than 2 switches for your iSCSI network. A typical topology would have at least 2 core switches with edge switches feeding off these to provide multiple redundant paths down to your storage. This is all detailed in my article linked above.

The whole point of multipathing is to allow Windows Server to host highly available local SAN disks, otherwise the OS would see the multiple paths as separate disk devices, which they are not.

With 10GbE available you're exceeding the capabilities of a standard FC setup ;-)


You are correct and I understand that! It has been very difficult to explain and support my arguments though. I've been questioned a lot (even though I know this from experience) and it is really FRUSTRATING! :-(

Anyway, I appreciate the follow up. I can check those other settings you mention, I'll post once I get that ...
Perry Whittle
sql-lover (4/24/2013)
It has been very difficult to explain and support my arguments though. I've been questioned a lot (even though I know this from experience) and it is really FRUSTRATING! :-(

point them to my article ;-)

sql-lover
Just in case someone else is reading this thread and faces a similar issue.

Our IT guy / SAN expert contacted Microsoft. He had a meeting with them and the Microsoft engineer reviewed the whole cluster implementation. He did not find anything wrong with MS-SQL or its configuration, but suggested these two OS changes:

-Change the network binding order. Put HeartBeat second and SAN last (SAN was 2nd and heartbeat the last one)
-Assign fixed IP values on the iSCSI initiator properties

I was absent from the meeting, and I do not understand the 1st suggestion. It is usually how I set up my cluster implementations. I'll give the second one a try though.
Perry Whittle
sql-lover (5/2/2013)
-Change the network binding order. Put HeartBeat second and SAN last (SAN was 2nd and heartbeat the last one)

I can sort of see why, but I don't see that this is too relevant; the cluster communication can still take place over the public network (the default setting).


sql-lover (5/2/2013)
-Assign fixed IP values on the iSCSI initiator properties

Now this is relevant. Your heartbeat and iSCSI adapters should be set to not register themselves in DNS. Always provide fixed IP details to the initiator disk device connection to ensure the correct adapters are bound. My article linked above details this.
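As a rough illustration of the two checks described here, the Python sketch below flags sessions left on a default initiator address and dedicated adapters that still register in DNS. The dictionaries, field names, IP addresses, and function names are all hypothetical stand-ins for settings you would read off each node by hand; nothing here queries Windows or the iSCSI initiator.

```python
def unpinned_sessions(sessions):
    """Targets whose session has no explicit initiator IP recorded."""
    return [s["target"] for s in sessions
            if s.get("initiator_ip", "Default") == "Default"]


def adapters_registering_in_dns(adapters):
    """Heartbeat / iSCSI adapters that still register themselves in DNS."""
    return [a["name"] for a in adapters
            if a["role"] in ("heartbeat", "iscsi") and a["register_in_dns"]]


# Hypothetical inventory for one node; every value is made up.
sessions = [
    {"target": "san-portal-1", "initiator_ip": "10.0.0.11"},
    {"target": "san-portal-2", "initiator_ip": "Default"},
]
adapters = [
    {"name": "Public", "role": "public", "register_in_dns": True},
    {"name": "HeartBeat", "role": "heartbeat", "register_in_dns": False},
    {"name": "SAN", "role": "iscsi", "register_in_dns": True},
]

print(unpinned_sessions(sessions))
print(adapters_registering_in_dns(adapters))
```

Anything either function returns is a candidate for the fixed-IP or DNS-registration change suggested above.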
