Check DB MSDB

  • I run DBCC OPENTRAN which results in

    "No open transactions"

    However When I run I get:

    DBCC CHECKDB("msdb") with TABLOCK

    Msg 5030, Level 16, State 12, Line 1

    The database could not be exclusively locked to perform the operation.

    Msg 7926, Level 16, State 1, Line 1

    Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details.

    This is 365x24x7 server/system and would like to avoid stopping the SQL Service or moving it to single user mode to perform the above action.

  • OpenTran just looks for open transactions, but any connection, even if it's not running a query will prevent an exclusive DB lock.

    Why are you running checkDB with the TABLOCK option? If you omit that, it will generate a hidden DB snapshot and won't need to lock the DB.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I have run it without the tablock part before and I get:

    Msg 1823, Level 16, State 2, Line 1

    A database snapshot cannot be created because it failed to start.

    Msg 7928, Level 16, State 1, Line 1

    The database snapshot for online checks could not be created. Either the reason is given in a previous error or one of the underlying volumes does not support sparse files or alternate streams. Attempting to get exclusive access to run checks offline.

    Msg 5030, Level 16, State 12, Line 1

    The database could not be exclusively locked to perform the operation.

    Msg 7926, Level 16, State 1, Line 1

    Check statement aborted. The database could not be checked as a database snapshot could not be created and the database or table could not be locked. See Books Online for details of when this behavior is expected and what workarounds exist. Also see previous errors for more details.

  • This problem can be caused by a 3rd-party file encryption solution or if NTFS compression is being used.

    Also see http://support.microsoft.com/kb/926070

    Error messages may be logged in the SQL Server error log after you run a DBCC command in SQL Server 2005

    In SQL Server 2005, DBCC commands use internal read-only database snapshots. These database snapshots are created on the same drive where the corresponding database data files are located. Database snapshots grow in proportion to the amount of changed data in the database against which the DBCC commands run. If transactional activity continues on this database, the database snapshots that are created by DBCC commands may experience disk space issues. Because the database snapshot files and the actual data files reside on the same disk drive, both sets of files compete for disk space. In this case, application transactions or user transactions are given preference. Therefore, the DBCC commands experience errors and cannot finish.[\quote]

    SQL = Scarcely Qualifies as a Language

  • No NTFS compression or third party applications:

    Get the following message when I look in the SQL Log:

    Time-out occurred while waiting for buffer latch type 3 for page (1:712)

  • You can't do a WITH TABLOCK CHECKDB of msdb (or master or tempdb) as that requires an exclusive database lock to run the allocation checks (equivalent of DBCC CHECKALLOC).

    If it was a space issue, there would be an error to that effect. It's more likely an IO taking too long. Can you look in the error log for any errors saying something like an IO failed to complete or took more than X seconds? If an IO times out and the IO is necessary for the database snapshot to start up, the snapshot (and CHECKDB) will fail.

    Is this a regularly scheduled DBCC CHECKDB of msdb or a one-off bevause you know something is going wrong?

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Paul Randal (5/20/2009)


    . It's more likely an IO taking too long. Can you look in the error log for any errors saying something like an IO failed to complete or took more than X seconds? If an IO times out and the IO is necessary for the database snapshot to start up, the snapshot (and CHECKDB) will fail.

    Is this a regularly scheduled DBCC CHECKDB of msdb or a one-off bevause you know something is going wrong?

    Thanks

    No IO issues reported in the log. We are running the CheckDB as when we try to request job information through SSMS it locks up. The database will no longer back up it comes back with the buffer latch error above

  • ok - looks like you've got something wrong with msdb (obviously 🙂 and it's in the IO subsystem - all operations are stalling waiting for reads to complete.

    If the db can't even be backed up, you're very limited with that you can do - I would have suggested moving it to a different location.

    I'd first look in the IO subsystem control tools to see if there are any issues (depending on what kind of IO subsystem it is). As a last resort you might try power cycling the IO subsystem to see if a 'stuck' IO (e.g. because of a controller bug) becomes unstuck.

    Of course, if you have backups of msdb, then restoring them to a new location would be preferable - but you do still have the IO subsystem problem to deal with.

    Kind of hard to give really objective advice without sitting there with you looking at the various diagnostics. Can you get your hardware admins involved to help out?

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Thanks I work in the network teams and have had the hardware guy give the system a look over previously, and he has found no issues with the hardware.

    However we are quite a small team and he by no means an expert in this area. Like me he has to juggle a number of roles.

    I do some know a bit of Windows Server 2003, you mention IO subsystem control tools. Is it possible to have a look at this myself through windows server 2003?

  • I don't know - totally depends on your IO subsystem vendor and what tools they provide.

    I asked a few other MVPs for opinions and they all say the same thing, from experience - power cycle the IO subsystem would be the best advice.

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • I have spoke to the most experienced hardware guy that I could find.

    He’s unsure what you me by “power cycling the IO subsystem”. He suggests that it either means a reboot or a complete power down.

    Would it be possible to clarify?

  • Power down SQL Server and the IO subsystem, then power them up again.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Have tried this twice but it has not resolved the issue. We have no backup of the msdb because the company that adminied it previously "didn't see the point of taking system backups". Any other idea anyone?

  • totally agreed to what Paul says ...

    When you run checkdb its gonna create a sparse file so that msdb can be used by users while checkdb is running ..

    Latch waits clearly point to the disk issue ...latches have to be held for very less time for page access synchronization ..

    so its trying to create the sparse file , hitting the latch waittime issue ..

    I am very sure that you are also getting the dump generated ..

    i know of on sparse file issue that logged and fixed in SQL Server 2005 SP3 ..

    Abhay Chaudhary
    Sr.DBA (MCITP/MCTS :SQL Server 2005/2008 ,OCP 9i)

  • also check the error logs and copy the errors here if any ......

    Abhay Chaudhary
    Sr.DBA (MCITP/MCTS :SQL Server 2005/2008 ,OCP 9i)

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply