Catching culprits of high tempdb growth

  • Hello,

    We are facing issues with tempdb on our SQL server 2008 clustered instance. The SQL version is Microsoft SQL Server 2008 (SP1) - 10.0.2714.0 (X64) May 14 2009 16:08:52 Copyright (c) 1988-2008 Microsoft Corporation Enterprise Edition (64-bit) on Windows NT 5.2 <X64> (Build 3790: Service Pack 2)

    Once or twice a month, tempdb grows to unexpectedly high values and log file growth fills up the drive. The problem is that it is happening at Random times and there is no trend. We are looking to find out what is causing the tempdb to fill. Is there a script that I can deploy in my environment using which I can find out the culprits that are growing my tempdb? I tried searching through default trace files but those are not very helpful. This has been a headache since last 3 months or so and everytime we have to restart SQL server during production hours to solve the problem.

    Any help would be appreciated.

    Thanks in advance!!

  • Hi,

    Are there any sessions that have been running for a long time? Any by long time I mean running for days. We just had two incidents like that where the tempdb log growth was out of control. We found that the tempdb checkpoint that usually occurs around 70% log utilization was not occurring. The root cause of the issue seemed to be orphan db sessions. Killing the SPID made no difference as the sessions were then essentially un-killable (stuck in KILLED\ROLLBACK state). Unfortunately the only solution was restarting the instance.

  • are you sure it's not an index rebuilding session that is making the tempdb grow? the frequency, of noticing once or twice a month sounds about right.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Question: How big is it getting? Remember, that all sorting of data goes on there and it can get quite large. If you rebuild indexes and say sort in TEMPDB it will grow it as well.

  • Are you using snapshot isolation level?

  • Thanks guys!!

    The last time we faced this issue was on a Friday and tempdb grew during off hours i.e. when there was no one actively monitoring. When I was informed about this, I can tell there were many sessiosn running but I just couldn't isloate which one was culprit. Before I could do more research I was forced to restart SQL.

    Index rebuilding sessions were surely not runnig as those are not done without DBAs knowing this. And we have a properly optimized tempdb with data and log files in different LUNs, proper auto-growth configuration, multiple data files etc. We have set the initial log file size to 40 GB but when it grows it suddenly goes up to 400 GB and thats what the limit of the drive is. Then all the processing stops with tempdb log file full errors and we are out of options.

    And DBCC usertoptions shows isolation level as read committed.

    I am just looking to capture the culprit (be it a long running session, or index rebuild or anything else). Is there a way to automate this so that whenever a growth on tempdb occurs, something gets logged somewhere (in a table, or errorlog etc.) I get to know that this particular thing is the problem.

    Thanks again!!

  • Have a look at the sys.dm_db_task_space_usage and sys.dm_db_session_space_usage DMVs.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks Gail. We faced the problem again today and this helped a lot.

  • j.a.c (3/27/2013)


    Hi,

    Are there any sessions that have been running for a long time? Any by long time I mean running for days.

    Yep, first thing to look.

    Faced the same problem some time ago.

    Turned out it was a bug in the application - closing opening connection "conn" in try, and closing "conn1" in catch.

    Because the issue happened only after an exception in that particular piece of functionality it was also totally random, with no pattern to be found.

    Try to match the timing of tempdb issues with exceptions recorded in the application log.

    _____________
    Code for TallyGenerator

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply