High TempDB usage - Where is the culprit?

  • SQL 2012 SP1 Standard on a VM with 4 Vcores and 40 GB allocated RAM.

    I have noticed that TEMPDB usage is very high lately and of the two 15GB files (30GB total TempDB size), there is only 2.6 GB free. I am trying to find out why this could be. I have executed a query on sys.dm_db_file_space_usage and got the following results (Query1):

    Free Pages: 343608

    Free Space in MB: 2684

    Version store Pages used: 1440

    Version store space MB: 11.25

    Internal Object Pages used: 6144

    Internal Object space MB: 48

    User Object Pages used: 3646000

    User Object space in MB: 28484.375

    So if I am reading this correctly, the user objects are taking up the space in TEMPDB. I then ran a query (Query2) on sys.dm_exec_request joined with dm_exec_sql_text to find the sessions consuming the space, but the numbers don't add up.

    Looking at the sum of sys.dm_db_task_space_usage.user_objects_alloc_page_count for the sessions, the total usage count for all sessions doesn't equal the usage I am seeing in TEMPDB, notable only 2.2 GB of usage. Where as the user object space is reporting a usage of 28GB plus!

    Can anyone give any advice on whether I am reading this correctly and the best way to find the consumption of TempDB? Thanks.

    The queries used are:

    Query 1:

    SELECT

    SUM(unallocated_extent_page_count) AS [free pages], (SUM(unallocated_extent_page_count)*1.0/128) AS [free space in MB],SUM(version_store_reserved_page_count) AS [version store pages used],

    (SUM(version_store_reserved_page_count)*1.0/128) AS [version store space in MB],SUM(internal_object_reserved_page_count) AS [internal object pages used],

    (SUM(internal_object_reserved_page_count)*1.0/128) AS [internal object space in [MB],SUM(user_object_reserved_page_count) AS ,

    (SUM(user_object_reserved_page_count)*1.0/128) AS

    FROM sys.dm_db_file_space_usage;

    Query 2:

    SELECT R1.session_id, R1.request_id, R1.Task_request_internal_objects_alloc_page_count, R1.Task_request_internal_objects_dealloc_page_count,

    R1.Task_request_user_objects_alloc_page_count,R1.Task_request_user_objects_dealloc_page_count,R3.Session_request_internal_objects_alloc_page_count ,

    R3.Session_request_internal_objects_dealloc_page_count,R3.Session_request_user_objects_alloc_page_count,R3.Session_request_user_objects_dealloc_page_count,

    R2.sql_handle, RL2.text as SQLText, R2.statement_start_offset, R2.statement_end_offset, R2.plan_handle FROM (SELECT session_id, request_id,

    SUM(internal_objects_alloc_page_count) AS Task_request_internal_objects_alloc_page_count, SUM(internal_objects_dealloc_page_count)AS

    Task_request_internal_objects_dealloc_page_count,SUM(user_objects_alloc_page_count) AS Task_request_user_objects_alloc_page_count,

    SUM(user_objects_dealloc_page_count)AS Task_request_user_objects_dealloc_page_count FROM sys.dm_db_task_space_usage

    GROUP BY session_id, request_id) R1 INNER JOIN (SELECT session_id, SUM(internal_objects_alloc_page_count) AS Session_request_internal_objects_alloc_page_count,

    SUM(internal_objects_dealloc_page_count)AS Session_request_internal_objects_dealloc_page_count,SUM(user_objects_alloc_page_count) AS Session_request_user_objects_alloc_page_count,

    SUM(user_objects_dealloc_page_count)AS Session_request_user_objects_dealloc_page_count FROM sys.dm_db_Session_space_usage

    GROUP BY session_id) R3 on R1.session_id = R3.session_id

    left outer JOIN sys.dm_exec_requests R2 ON R1.session_id = R2.session_id and R1.request_id = R2.request_id

    OUTER APPLY sys.dm_exec_sql_text(R2.sql_handle) AS RL2

    Where

    Task_request_internal_objects_alloc_page_count >0 or

    Task_request_internal_objects_dealloc_page_count>0 or

    Task_request_user_objects_alloc_page_count >0 or

    Task_request_user_objects_dealloc_page_count >0 or

    Session_request_internal_objects_alloc_page_count >0 or

    Session_request_internal_objects_dealloc_page_count >0 or

    Session_request_user_objects_alloc_page_count >0 or

    Session_request_user_objects_dealloc_page_count >0

  • Do you have any databases that use Snapshot Isolation?

    select name, snapshot_isolation_state

    from sys.databases

    where snapshot_isolation_state > 0

    we have an application that did not free up it's snapshots on a timely basis, so we had to resort to restarting the application once a month.

  • Thanks for the reply. Only one database (bar master and msdb) are using snapshot isolation. The vesioning would be kept in the version store of TempDB though, wouldn't they, so this doesn't look to be the problem.

    One thing I have noticed which may or may not be related is the SQL server performance bombing out at times. I have found that when a user views a view definition, either by looking at the design option of a view or asking for a create/alter script to be generated, then SSMS hangs, SQL server performance drops and PLE drops to zero! I notice millions of reads in the IO and the same for writes. Once I kill off the session trying to do the view definition query, then all starts working as normal. The wait type when this happens is PAGEIOLATCH_SH:tempdb:3(*) (taken from whoisactive) which makes me think this could be to do with tempdb and free space.

    This may just be a coincidence.

  • Maddave (3/5/2014)


    Thanks for the reply. Only one database (bar master and msdb) are using snapshot isolation. The vesioning would be kept in the version store of TempDB though, wouldn't they, so this doesn't look to be the problem.

    One thing I have noticed which may or may not be related is the SQL server performance bombing out at times. I have found that when a user views a view definition, either by looking at the design option of a view or asking for a create/alter script to be generated, then SSMS hangs, SQL server performance drops and PLE drops to zero! I notice millions of reads in the IO and the same for writes. Once I kill off the session trying to do the view definition query, then all starts working as normal. The wait type when this happens is PAGEIOLATCH_SH:tempdb:3(*) (taken from whoisactive) which makes me think this could be to do with tempdb and free space.

    This may just be a coincidence.

    sp_whoisactive can also show you the actual tempdb consumption along with the queries running, query plans, and lots of other details. That should easily allow you to tie tempdb usage to queries causing high usage.

    PLE to zero is often caused by a bug somewhere that is causing a flush of sql memory. I have seen MANY of these over the years, from hardware drivers to windows to SQL Server itself. There are some very nasty memory (and NUMA) bugs in SQL 2012 that you need to be patched up to get around. What is your build level?

    I also note you are on a VM. So now another thing comes into play and that is the VM causing some hoky-poky stuff to happen. Have your VM admin monitor for a variety of things too. There are SOOO many ways you can mis-configure SQL Server and having VM in the mix add to that list significantly.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Many thanks for the reply. I am on build 11.0.3128. I've only applied the fix to the ms installer running bug as I experienced this a while back. I was only told to wait for service packs rather than apply CU's, but I might look to roll these out now.

    I have ended up resolving the issue by bouncing the instance last night. After this, PLE has now grown back to the expected 15 hours rather jumping up and down like it has been over the last few days. Tempdb is also cleared down (as expected after a restart), but running the two queries in my original post both the value for the tempdb in use by users from the first query and the sum of the user allocation of tempdb in the second query, now tally, where as before they were way out. Viewing a view definition now works quickly and doesn't cause a drop in PLE.

    So something was definitely wrong and causing stability issues. I feel I will not know the exact cause of this, but will be better prepared if I see things like this again.

    Thanks again for your help.

  • Maddave (3/6/2014)


    Many thanks for the reply. I am on build 11.0.3128. I've only applied the fix to the ms installer running bug as I experienced this a while back. I was only told to wait for service packs rather than apply CU's, but I might look to roll these out now.

    I have ended up resolving the issue by bouncing the instance last night. After this, PLE has now grown back to the expected 15 hours rather jumping up and down like it has been over the last few days. Tempdb is also cleared down (as expected after a restart), but running the two queries in my original post both the value for the tempdb in use by users from the first query and the sum of the user allocation of tempdb in the second query, now tally, where as before they were way out. Viewing a view definition now works quickly and doesn't cause a drop in PLE.

    So something was definitely wrong and causing stability issues. I feel I will not know the exact cause of this, but will be better prepared if I see things like this again.

    Thanks again for your help.

    The days of waiting for a service pack for SQL Server are long gone I am afraid. Those that do that will come up against issues FAR more often than those who wait for the service pack, IMNSHO.

    As long as you can bounce the server when this issue crops up, it seems like you have found an acceptable workaround. :hehe:

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 6 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply