Split-brain mirroring

  • Hi Guys,

    Some time ago we had a split-brain problem with our mirroring. (We use a witness.)

    Can anybody who has implemented mirroring tell me how likely this situation is to happen again?

    Thanks,

    Luiz.

  • Please elaborate on your findings, situation and results or side effects.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Please describe in detail what happened.


    My blog: SQL Soldier
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management

  • Here is what happened: we had a principal, mirror, and witness set up for one database.

    Before the split-brain:

    - On Machine 1, we had DatabaseName (Principal, Synchronized)

    - On Machine 2, we had DatabaseName (Mirror, Synchronized / Restoring...)

    - Machine 3 was the witness

    After the split-brain happened:

    - On Machine 1, we had DatabaseName (Principal, Synchronized)

    - On Machine 2, we had DatabaseName (Principal, Synchronizing)

    - Machine 3 was still the witness

    I know this has happened to other people, and cases like this have been reported before.

    Can anybody who has experienced a case like this advise? Is there a way to be sure that this won't happen again?

    Thanks.

  • It could be a perceived split-brain scenario. If you are above the builds mentioned in the KB article below, enable trace flags 1439 and 3605 to have the synchronization information printed to the SQL Server error log. That will confirm whether the database restart task is stuck.

    Link:

    http://support.microsoft.com/default.aspx?scid=kb;EN-US;983500
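
    A minimal sketch of turning those trace flags on, assuming the instance is already on a build that includes the KB 983500 update. The -1 argument, which applies the flags to the whole instance rather than only the current session, is an assumption about how you would want to run this:

    ```sql
    -- Enable trace flag 1439 (mirroring synchronization diagnostics, per KB 983500)
    -- and 3605 (send trace output to the SQL Server error log).
    -- The -1 argument makes both flags global; this is an assumption, adjust as needed.
    DBCC TRACEON (1439, 3605, -1);

    -- Confirm that both flags are active.
    DBCC TRACESTATUS (1439, 3605, -1);
    ```

    Flags set with DBCC TRACEON do not survive a service restart. To keep them on permanently they would have to be added as -T startup parameters.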

    This posting is provided "AS IS" with no warranties, and confers no rights.
    My Blog: TroubleshootingSQL
    Twitter: @banerjeeamit

  • Split brain is only documented in cases where there is no witness. It is (supposedly) not possible when a witness is used.

    What happened in between? Was there a failover triggered or an outage or a system reboot?

    Could you have been hit by this bug: http://support.microsoft.com/default.aspx?scid=kb;EN-US;983500



  • Thank you for all your answers. I'm analyzing each one.

    About Robert's question: "Could you have been hit by this bug: http://support.microsoft.com/default.aspx?scid=kb;EN-US;983500"

    Yes, that seems to be the same problem. At least the symptoms are the same: "The status of both databases is Principal". I don't know if it has the same cause.

    This article recommends the upgrade, but it also says:

    "This fix does not improve the performance role synchronization. This fix helps find obstructions in the database restart task by checking the SQL Server error log file."

    And more:

    "Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the 'Applies to' section."

    So, is it saying that even if I do everything I can to have a sound mirroring setup, I can still get a split-brain?

    Am I interpreting this right?

    Thanks.

  • No, read the KB article again. It's not split brain. If you try to query the former principal, you'll get an error.

    To have a split brain scenario, both servers must be serving data and executing queries.
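
    One way to look at this from the metadata side is sketched below, to be run on each partner separately. It assumes the database is called DatabaseName, the placeholder used earlier in this thread. The role reported here does not by itself show whether a server is actually accepting queries, so treat it as a starting point only:

    ```sql
    -- Run on Machine 1 and on Machine 2 and compare the results.
    -- N'DatabaseName' is a placeholder for the mirrored database.
    SELECT DB_NAME(m.database_id)        AS database_name
         , m.mirroring_role_desc         -- PRINCIPAL or MIRROR
         , m.mirroring_state_desc        -- e.g. SYNCHRONIZED, SYNCHRONIZING, DISCONNECTED
         , m.mirroring_witness_state_desc
         , m.mirroring_partner_instance
      FROM sys.database_mirroring AS m
     WHERE m.database_id = DB_ID(N'DatabaseName');

    -- Overall database status as this instance sees it (ONLINE, RESTORING, ...).
    SELECT DATABASEPROPERTYEX(N'DatabaseName', 'Status') AS database_status;
    ```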

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

  • Good points, GilaMonster. My case was different, since I was able to query both databases.

    About the other commentary: "This is only documented in cases where there is no witness. Not possible (supposedly) if using a witness"

    The split-brain cases that I have found on the internet happened with a witness. For example:

    http://serverfault.com/questions/251316/what-could-cause-mirrored-databases-to-both-become-the-principal-database

    http://social.msdn.microsoft.com/Forums/en/sqldatabasemirroring/thread/c1c71c05-6a17-4b89-9070-98cad9551616

    http://blogs.msdn.com/b/pedram/archive/2007/04/05/sql-server-2005-database-mirroring.aspx (example 5)

    http://sqlblog.com/blogs/tibor_karaszi/archive/2010/04/16/mirroring-what-happens-if-principal-loses-contact-with-both-mirror-and-wittness.aspx

    etc

    My understanding is that without a witness, the only way to get a split-brain is to manually promote the mirror to principal.

    The witness is there to prevent a split-brain, but there seem to be cases where it actually causes one.

    Any thoughts?

    Thanks a lot.

  • You can query both databases, but DML on the former principal would fail. I have not noted any CU8 SQL2K5 SP3. So if you are above that build and DML on the former principal is failing, then you have a stuck DB restart task, as I have mentioned in KB983500. You can query sys.dm_exec_requests to check that. The update mentioned in the article only adds diagnostic capability so that you can identify such a scenario more easily.

    If an application (like replication agent, SQL Agent job or any other client application) constantly logs onto the database and prevents the db restart task from acquiring a DB lock, then the state of the principal will not change to mirror. It could also happen due to orphaned DTC transactions if you have DTC transactions occurring on the mirrored database.
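
    A rough sketch of how the checks described above might be run on the former principal. N'DatabaseName' is a placeholder, and the choice of columns is an assumption about what is useful to look at. sys.sysprocesses is used for the session list because sys.dm_exec_sessions has no database_id column on SQL Server 2005:

    ```sql
    -- Requests currently running in the mirrored database, with any blocking.
    SELECT r.session_id, r.status, r.command
         , r.blocking_session_id, r.wait_type, r.wait_time
      FROM sys.dm_exec_requests AS r
     WHERE r.database_id = DB_ID(N'DatabaseName');

    -- Sessions connected to the database (agents, jobs, client applications).
    SELECT p.spid, p.program_name, p.hostname, p.loginame, p.login_time
      FROM sys.sysprocesses AS p
     WHERE p.dbid = DB_ID(N'DatabaseName');

    -- Locks held by orphaned distributed (DTC) transactions: these are
    -- reported with a request_session_id of -2.
    SELECT DISTINCT l.request_owner_guid
      FROM sys.dm_tran_locks AS l
     WHERE l.request_session_id = -2;
    ```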


  • Can you share the version information of the SQL Server instances that participate in this database mirroring setup?

    SELECT @@VERSION;

    or

    SELECT SERVERPROPERTY('Edition') AS Edition
         , SERVERPROPERTY('ProductVersion') AS ProductVersion
         , SERVERPROPERTY('ProductLevel') AS ProductLevel
         , SERVERPROPERTY('ResourceLastUpdateDateTime') AS ResourceLastUpdateDateTime
         , SERVERPROPERTY('ResourceVersion') AS ResourceVersion;

    Johan


  • Here is what I have now:

    Principal:

    Microsoft SQL Server 2005 - 9.00.3042.00 (X64)
    Feb 10 2007 00:59:02
    Copyright (c) 1988-2005 Microsoft Corporation
    Standard Edition (64-bit) on Windows NT 5.2 (Build 3790: Service Pack 2)

    Mirror:

    Microsoft SQL Server 2005 - 9.00.3042.00 (X64)
    Feb 10 2007 00:59:02
    Copyright (c) 1988-2005 Microsoft Corporation
    Standard Edition (64-bit) on Windows NT 6.0 (Build 6002: Service Pack 2)

    Witness:

    Microsoft SQL Server 2005 - 9.00.2047.00 (Intel X86)
    Apr 14 2006 01:12:25
    Copyright (c) 1988-2005 Microsoft Corporation
    Express Edition on Windows NT 6.0 (Build 6002: Service Pack 2)

    I'll update to SP4. After that, can I be sure that I won't get any split-brains? All opinions are appreciated.

    Thanks.

  • Those are pretty old builds. At this point, I am not sure if it was actually a split brain or not. But after you update to the latest CU, enable the trace flag in the KB Article that I mentioned and if you run into this issue *again*, then please contact Microsoft CSS. We are not aware of any such split brain issues post SP4.


  • Did you manage to resolve your issue?

    Johan


  • Not yet. An upgrade is scheduled and then I'll observe what happens.
