Master DB corruption

  • Hi Experts,

    I get the messages below when running DBCC CHECKDB on the master database, and the session then gets disconnected. The CHECKDB that ran during restart completed without error, but running CHECKDB from a query window produces these errors:

     

    Msg 5901, Level 16, State 1, Line 1

    One or more recovery units belonging to database 'master' failed to generate a checkpoint. This is typically caused by lack of system resources such as disk or memory, or in some cases due to database corruption. Examine previous entries in the error log for more detailed information on this failure.

    Msg 1823, Level 16, State 2, Line 1

    A database snapshot cannot be created because it failed to start.

    Msg 1823, Level 16, State 8, Line 1

    A database snapshot cannot be created because it failed to start.

    Msg 7928, Level 16, State 1, Line 1

    The database snapshot for online checks could not be created. Either the reason is given in a previous error or one of the underlying volumes does not support sparse files or alternate streams. Attempting to get exclusive access to run checks offline.

    Msg 5030, Level 16, State 12, Line 1

    The database could not be exclusively locked to perform the operation.

    Msg 7926, Level 16, State 1, Line 1

    Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details.

    Msg 9001, Level 21, State 1, Line 1

    The log for database 'master' is not available. Check the event log for related error messages. Resolve any errors and restart the database.

  • Try restoring a recent backup of the master database under a different database name so you can check it for corruption.

  • Check for more information in the error log and the Windows event logs. It could also be some storage issues going on.

    Sue
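
As a rough sketch of the error log check suggested above, the current SQL Server error log can be searched from a query window. This assumes `xp_readerrorlog`, which is an undocumented (though widely used) extended procedure, so its parameters may differ between versions. The search strings are only examples.

```sql
-- Search the current SQL Server error log (log 0, type 1 = SQL Server log)
-- for entries that mention the master database.
EXEC sys.xp_readerrorlog 0, 1, N'master';

-- Narrow to high-severity errors: both strings must appear in an entry,
-- so this matches lines such as "Error: 9001, Severity: 21, State: 1."
EXEC sys.xp_readerrorlog 0, 1, N'Error:', N'Severity: 2';
```

The Windows System and Application event logs still need to be checked separately for storage or cluster events logged around the same time.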

  • Sue_H wrote:

    Check for more information in the error log and the Windows event logs. It could also be some storage issues going on.

    Sue

    Checked those and didn't find anything.

  • The instance is in a two-node failover cluster, and after failing over to the second node, DBCC CHECKDB completed without any errors.

  • VastSQL wrote:

    Msg 5901, Level 16, State 1, Line 1

    One or more recovery units belonging to database 'master' failed to generate a checkpoint. This is typically caused by lack of system resources such as disk or memory, or in some cases due to database corruption. Examine previous entries in the error log for more detailed information on this failure.

    Msg 1823, Level 16, State 8, Line 1

    A database snapshot cannot be created because it failed to start.

    Msg 7928, Level 16, State 1, Line 1

    The database snapshot for online checks could not be created. Either the reason is given in a previous error or one of the underlying volumes does not support sparse files or alternate streams. Attempting to get exclusive access to run checks offline.

    Msg 7926, Level 16, State 1, Line 1

    Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details.

    Msg 9001, Level 21, State 1, Line 1

    The log for database 'master' is not available. Check the event log for related error messages. Resolve any errors and restart the database.

    Questions:

    (1) Has the drive hosting the master database got any spare space?  How fragmented is that space?  (I'm thinking that there might not be enough room for sparse files for the snapshot)

    (2) Are there any other processes running that would get in the way of SQL Server as it tries to look at itself (think antivirus / malware detectors)?

    (3) Are you running enterprise edition?

    If reboot / failover fixed the problem, then I'm guessing it's probably (2).

     

    Thomas Rushton
    blog: https://thelonedba.wordpress.com
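
A minimal sketch for answering questions (1) and (3) from a query window, assuming SQL Server 2008 R2 SP1 or later, where `sys.dm_os_volume_stats` is available. It reports free space on the volume (or mount point) behind each master file, along with whether that volume supports sparse files and alternate streams, which Msg 7928 calls out. It does not measure fragmentation.

```sql
-- Free space and snapshot-related capabilities of the volume(s)
-- hosting the master database files.
SELECT mf.name                        AS logical_name,
       mf.physical_name,
       vs.volume_mount_point,
       vs.available_bytes / 1048576   AS free_mb,
       vs.total_bytes / 1048576       AS total_mb,
       vs.supports_sparse_files,
       vs.supports_alternate_streams
FROM sys.master_files AS mf
CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) AS vs
WHERE mf.database_id = DB_ID(N'master');

-- Edition and build of the instance.
SELECT SERVERPROPERTY('Edition')        AS edition,
       SERVERPROPERTY('ProductVersion') AS product_version;
```

Because the instance is clustered, it may be worth running this on each node while it owns the instance and comparing the results.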

  • Thomas Rushton wrote:

    Questions:

    (1) Has the drive hosting the master database got any spare space?  How fragmented is that space?  (I'm thinking that there might not be enough room for sparse files for the snapshot)

    (2) Are there any other processes running that would get in the way of SQL Server as it tries to look at itself (think antivirus / malware detectors)?

    (3) Are you running enterprise edition?

    If reboot / failover fixed the problem, then I'm guessing it's probably (2).

    Thanks Thomas.

    1. Yes, there was enough space on the drive. The server uses mount points, and all system databases are in the root directory, which had enough free space.

    2. We did suspect AV but discounted it: if AV were holding the files, the server wouldn't start up and would fail over automatically.

    3. Yes, Enterprise Edition.

     

  • Did you reboot the old node and try again after the failover? I assume this is shared-storage clustering, so the same master.mdf was scanned by DBCC on both nodes?

    If so, I'd fail back and try again. If you still have errors, I don't know. I'd lean towards this being a transient in-memory problem on the first node, which cleared up on failover.

  • Steve Jones - SSC Editor wrote:

    Did you reboot the old node and try again after the failover? I assume this is shared-storage clustering, so the same master.mdf was scanned by DBCC on both nodes?

    If so, I'd fail back and try again. If you still have errors, I don't know. I'd lean towards this being a transient in-memory problem on the first node, which cleared up on failover.

    It seems the issue was what you described. We failed back to node 1 and CHECKDB completed without errors.
