SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Recently Recompiled Resource Hogs

It’s not too uncommon for a query to get a new execution plan that performs a lot worse than it could, and sometimes it’s bad enough to drag the whole server down to a halt.  When it’s something obvious such as a query going from 2 seconds duration to 30 seconds you’ll pick it up on a standard monitoring trace.  That’s not always the case, and the query that hits your server twice a second going from 100ms to 900ms can fly under the radar except for users saying the server isn’t as responsive as it was yesterday.

Before you go all out and throw up an intrusive trace that can slow down your server capturing an unfiltered sample of RPC:Completed and SQL:BatchCompleted to run aggregates on, see what the DMVs can tell you.  If you look in sys.dm_exec_query_stats you can get the total amount of resources used since the last recompilation, which is going to give you pretty much everything you need as long as you’re looking for a query that can stay in the cache.  The only trick is to look at everything in the same context.  Something that did 10,000,000 reads since it was compiled yesterday is nothing compared to something that did 1,000,000 reads since it was compiled 10 minutes ago.  Here’s my answer to this:

SELECT top 10 --WorkerTime_Sec = cast(round(total_worker_time / 1000 / 1000.0, 0) as Int)
    WorkerTimeSec_PerSec =  cast(cast(round(total_worker_time / 1000 / 1000.0, 0) as Int) / (Cast(DateDiff(second, Creation_Time, GetDate()) as DEC(20,2)) + .00000001) as DEC(20,2))
    --, LogicalReads_K = total_logical_reads / 1000
    , LogicalReads_K_PerSec = cast(total_logical_reads / 1000 / (Cast(DateDiff(second, Creation_Time, GetDate())as DEC(20,2)) + .00000001) as DEC(20,2))
    --, ElapsedTime_Sec = cast(round(total_elapsed_time / 1000 / 1000.0, 0) as Int)
    , ElapsedTimeSec_PerSec = cast(cast(round(total_elapsed_time / 1000 / 1000.0, 0) as Int)/ (Cast(DateDiff(second, Creation_Time, GetDate())as DEC(20,2)) + .00000001) as DEC(20,2))
    , Seconds_Since_Recompile = DateDiff(second, Creation_Time, GetDate())
    , Creation_Time
    , qs.execution_count
    , avg_reads = total_logical_reads / qs.execution_count
    , avg_elapsed_ms = total_elapsed_time / 1000 / qs.execution_count
    , avg_worker_ms = total_worker_time / 1000 / qs.execution_count
    , Offset = statement_start_offset
    , ObjectName = OBJECT_NAME(qt.objectid, qt.dbid)
    , qt.text
    , StatementText = SUBSTRING (qt.text, qs.statement_start_offset/2 + 1,
        abs(CASE WHEN qs.statement_end_offset = -1
               THEN LEN(CONVERT(NVARCHAR(MAX), qt.text)) * 2 
            ELSE qs.statement_end_offset END - qs.statement_start_offset)/2 + 1)
    --, qt.*
    --, qs.*
FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
WHERE Creation_Time > DateAdd(Hour, -24, GetDate())
    AND (creation_time < DATEADD(SECOND, -60, GETDATE()) or execution_count > 5) --Has to be a minute old or executed 6 times before I consider the data stable
    --AND (total_worker_time > 20000 or execution_count > 15000)

I comment out what I don’t typically use, same as always with me.  Those pieces don’t usually change.  The only thing that does typically change is the date filter on the creation time; if someone says there weren’t any problems this time yesterday then it stays at the default value, if they say it worked fine before lunch then it drops down to 4 hours.

There are four things to keep in mind with this DMV.  Everything is in microseconds which I don’t use, so I “fixed” the problem in my query.  It will only hold data while the query is in cache, which means it’s never tracked here if it’s never in cache.  Each row is for a single plan for a single statement, although  you could easily aggregate it if you wanted to.  Also, it’s possible for the worker time or duration per second to be over 1 due to multiple simultaneous runs or, with worker time, parallelism can cause this.  I typically try not to restate what you should be reading on MSDN or BOL, but these aren’t things everyone expects.

For me to run this, I typically have to already have a complaint that a server is running slower than expected. Check out my Server Running Slow post to see everything else that runs along with this.

Filed under: Scripts, System State Tagged: cache, DMV, sys.dm_exec_query_stats

Simple SQL Server

I started out working with Microsoft Access and SQL Server back in 2000 as the only employee doing IT full-time, and worked most of my career where “big fish in a little pond” was an overstatement. Learning is scarce when you do everything and don’t work with anyone who knows more than you. In 2010 I was plunged into the Ocean and grabbed onto anything I could find to stay afloat. I wasn’t going to simply run scripts I didn’t understand, so I learned the DMVs and system tables in the scripts I found and rewrote them all. Now, I know enough where I can start giving back to a community that saved me from drowning.


Leave a comment on the original post [simplesqlserver.com, opens in a new window]

Loading comments...