DBCC CheckDB runs slow, doesn't seem to use TempDB, and waits on RESOURCE_SEMAPHORE

  • For several nights now, a maintenance plan that had been working perfectly stopped working. It's a simple CheckDB to every DB on the instance plan. It doesn't error it just runs all night long, and is still running the next day.

    Previously, it had completed in ~1h 30m.

    So I start running it manually, and on a single DB. It took a full 1h 30m for a 1.5GB GB. While it's running, I see it frequently waiting on a RESOURCE_SEMAPHORE. The machine is a development machine, and only under light use at the moment, and has 4GB of RAM allocated to this instance.

    There have been a couple of tempDB issues in the windows logs from a few days back, so I start poking around with that. The disk has plenty of room (50gb free), tempDB is allowed to autogrow, and is split into 4 data files, and are at 200MB each.

    I've read that CheckDB tries to snapshot the DB being checked into TempDB, so I thought I'd hit on the issue. But I've tried several things and nothing seems to help. Increased TempDB's size to 500MB/file. Verified that instant file initiationation is configured to make sure any autogrows go faster. No help.

    Then I reran the checkDB, and found a few scripts that examine TempDB usage. While it's running, it's not allocating any TempDB for itself.

    So then I ran the same thing against our QA server, identical in every way (or at least as close as these things can be ). It runs fast there, and I can see it allocating a bit of tempDB to itself.

    My next step was to check locks. I know if for some reason, it can't use TempDB for the checkDB, it will lock the DB and do it that way.

    Both in Dev and QA, I see DB locks while check DB is running. The difference is Dev, the problem server, is only grabbing one lock from sys.dm_tran_locks. On QA, the fast one, I see it grabbing 4.

    I'm more than a bit stumped now. Any ideas?

    It's a long shot, but my next investigation will be to make sure the virtual the dev is on actually has all of its CPUs running, and there's nothing crazy like a CPU affinity set for SQL. At this point, I should mention that I did check to look at the Max DOP setting, and it's set to 0 on the server.

  • Hi,

    have you checked the non-default configuration changes of both servers (in "Server Dashboard" report)? Perhaps there are some differences.

    Greets

    [font="Arial"]Kind regards,

    Patrick Fiedler
    Consultant

    Fiedler SQL Consulting

    www.fiedler-sql-consulting.de
    [/font]

  • CheckDB doesn't use TempDB except for indexed view checks. It creates a database snapshot and checks that. No locks unless you specify WITH TABLOCK

    Often an increased duration means it's found something wrong and needs to go back and check deeper. Anything in the error log? If it starts a deep dive, I believe it will log it (on 2008)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Patrick: No differences on the configurations between the two servers.

    Gail: Thanks for the clarification. The snap for indexes makes sense, but I definitely see a lock requested and granted an an otherwise idle DB only when I run the checkDB against it. It's probably a red herring though... it's a shared database level lock that shows up in

    To answer your other question, the only thing I see in the logs is it iterating though the DBs on the server. Comparing to prior to when this started, the time difference is 8 seconds originally, 56 minutes since the problem surfaced for a given single DB. It's finding no errors or reporting problems, just taking orders of magnitude more time.

    Having said all that, I just heard back from our infrastructure guy, and it sounds like there are potential I/O issues with the underlying disks of the dev box. I'm guessing checkDB thrashes the hell out of disks, so this definitely seems like it could be the cause.

  • llevity (4/8/2011)


    Gail: Thanks for the clarification. The snap for indexes makes sense, but I definitely see a lock requested and granted an an otherwise idle DB only when I run the checkDB against it. It's probably a red herring though... it's a shared database level lock that shows up in

    The snapshot's for the entire DB, not indexes. TempDB is used for materialised views (indexed views)

    Every connection to a DB takes a shared database lock. It's so that the DB can't be dropped, closed, detached, restored while there are open connections using it.

    I'm guessing checkDB thrashes the hell out of disks, so this definitely seems like it could be the cause.

    Yup, it reads every single page of the DB. Very IO intensive

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Okay, I think it's finally sunk in.

    It always takes a snapshot of the db and checkdb actually runs against the snapshot. It will only lock the db if you specifically tell it to, in lieu of the snap.

    Tempdb really only comes into play with indexed view.

    And finally, the shared db lock I saw is the same lock that makes me click the "close connections" box when I'm trying to restore over an existing db and I still have another connection to it in another window. 😛

    I'm mainly surprised about tempdb. I thought it had a bigger involvement. I know I've run into an issue where checkdb essentially filled up a tempdb, but I don't remember the specifics. Maybe the db made heavy use of indexed views, or tempdb was just sized way too small.

    Anyhow, thanks for the help, and the corrections. Slowly but surely, I learn 🙂

  • Yup.

    It's odd that it's a resource semaphore wait though. IO related I would have expected IO latches rather.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • That's true. As I understand it, the resource semaphore basically means it's waiting on a chunk of memory to be granted to the task?

    The only thing I can think of is the io issues are causing memory pressure somehow too.

    I guess Monday I can look up some counters to check for memory pressure.

  • I hope nobody enabled drive (lun) compression for your disk(s) or folder(s).

    (imo that would be noticable more than just with checkdb unless you realy have litte activity)

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply