• gods-of-war - Wednesday, May 31, 2017 12:09 PM

    W2k12/SQL-2012 11.0.5058
    Failover Cluster, 2 nodes.
    Worked for a year and a half, then we had a drive go bad on the SAN, which pushed drives out via DataCore. We replaced the drive, restarted the erring instance, and waited for the replication to complete (it took some time, as the two DataCores are on different hospital campuses several miles apart).
    Once replication was done, we drained Node 2, rebooted the physical server, and let it come back up. Then we went to drain Node 1 (the primary, hosting the SQL cluster) and it threw the above error for every drive we tried to pause and drain. We went ahead and rebooted the server and nothing failed over at all: not quorum, MSDTC, or any of the data drives. The SQL role didn't come back up until Node 1 had fully rebooted.
    Both nodes show as being up. I have stopped and restarted the cluster service on Node 2, paused and drained, resumed with no failback, and rebooted multiple times.

    When I try to force the quorum, the error I get reads:
     [NETFTAPI] Failed to query parameters for 169.254.188.212 (status 0x80070490)  
    Source: MWFC/Diagnostic
    Task Category: Cluster Virtual Adaptor  
    Event ID 4110  
    Node: ECWDB2.xxx.xxx

    I have run Cluster Validation, and the only errors I get are under networking. They are the generic ones you get when the machine has additional NICs that are disabled and when using iSCSI.

    "Node ecwdb2.xxx.xxx and Node ecwdb1.xxx.xxx are connected by one or more communication paths that use disabled networks. These paths will not be used for cluster communication and will be ignored. This is because interfaces on these networks are connected to an iSCSI target. Consider adding additional networks to the cluster, or change the role of one or more cluster networks after the cluster has been created to ensure redundancy of cluster communication."

    "The communication path between network interface ecwdb1.xxx.xxx - iSCSI Team and network interface ecwdb2.xxx.xxx - iSCSI Team is on a disabled network."

    "Node ecwdb2.xxx.xxx is reachable from Node ecwdb1.xxx.xxx by only one pair of network interfaces. It is possible that this network path is a single point of failure for communication within the cluster. Please verify that this single path is highly available, or consider adding additional networks to the cluster."

    Success  ecwdb1.xxx.xxx - iSCSI Team  10.50.50.23  ecwdb2.xxx.xxx - iSCSI Team  10.50.50.24  True   0
    Failure  ecwdb1.xxx.xxx - iSCSI Team  10.50.50.23  ecwdb2.xxx.xxx - LAN         10.0.76.89   False  100
    Success  ecwdb1.xxx.xxx - LAN         10.0.76.75   ecwdb2.xxx.xxx - LAN         10.0.76.89   True   0
    Failure  ecwdb1.xxx.xxx - LAN         10.0.76.75   ecwdb2.xxx.xxx - iSCSI Team  10.50.50.24  False  100
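
    A quick way to sanity-check those four rows: both failures pair an interface on the 10.50.50.x network with one on the 10.0.76.x network, so they may simply reflect the two networks not being routed to each other. The sketch below, in Python, checks this. The /24 prefix is an assumption (the subnet masks are not shown in the validation output), and the helper names are made up for illustration.

```python
import ipaddress

# Rows copied from the validation output:
# (result, source IP, destination IP, packet loss %)
rows = [
    ("Success", "10.50.50.23", "10.50.50.24", 0),
    ("Failure", "10.50.50.23", "10.0.76.89", 100),
    ("Success", "10.0.76.75", "10.0.76.89", 0),
    ("Failure", "10.0.76.75", "10.50.50.24", 100),
]

PREFIX = 24  # assumption: the real subnet masks are not given


def same_subnet(a, b, prefix=PREFIX):
    """True if address b falls inside the network that a belongs to."""
    net_a = ipaddress.ip_network(f"{a}/{prefix}", strict=False)
    return ipaddress.ip_address(b) in net_a


def unexpected_results(rows):
    """Return rows whose result disagrees with what the subnet layout predicts."""
    odd = []
    for result, src, dst, loss in rows:
        expected = "Success" if same_subnet(src, dst) else "Failure"
        if result != expected:
            odd.append((result, src, dst, loss))
    return odd


print(unexpected_results(rows))
```

    If the list comes back empty, every failure is a cross-network pair, which would suggest the ping results on their own do not point to a broken path between the nodes.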

    Anyone have any ideas? Evict Node 2 from the cluster and bring it back in, maybe?

    If both nodes are up, you shouldn't be forcing quorum.
    What quorum model were you using?

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉