Identifying Querys causing CPU spike

  • Hi Everyone,

    I have a situation where the CPU on my data warehouse server was pegged at 100% for approx. 10 minutes yesterday at 4:00pm. Is there anyway of finding out what sql was running against the database at that particular time yesterday? I know there are DMV's that will show you the top CPU consumers, but I want to be able to prove those querys were running at that specific time.

  • Take a look at Brent Ozars article on "playing doctor".

    It lists a bunch of free scripts out there to help identify problems such as high CPU.

    http://www.brentozar.com/sql-server-training-videos/playing-doctor-with-dmvs/

    Take a look at the script from Glenn Barry. It has a ton of diagnosis queries. There are two queries you want to search for. I enclosed just the description since I Glenn did all the work.

    A - CPU

    -- This helps you find the most expensive cached stored procedures from a CPU perspective

    -- You should look at this if you see signs of CPU pressure

    B - MEMORY

    -- This helps you find the most expensive cached stored procedures from a memory perspective

    -- You should look at this if you see signs of memory pressure

    Good luck on finding your issue.

    John Miner
    Crafty DBA
    www.craftydba.com

  • The problem with DMV analysis as well as trace events, is that you can't find out which specific procedures/statements caused a spike at a certain time. The trace have to be running on the production server at the time to capture these.

    It is possible that some of these events might have been caught with the system_health extended event session that's running by default, but not if it was only a standard CPU consuming query. It's worth to have a look anyway.

    I would look at the sys.dm_exec_query_stats and get the average of cpu time per execution and try to find out the possible culprtis (total_worker_time/execution_count), and set up an extended event session on a test server to try to reproduce results, or set up a light weight session on production if you monitor it closely.

    Since the introduction of the much more flexible and lightweight extended events I try to stay away from traces(most are avalable in XE) , and they will probably be phased out anyway. Look at http://sqlblog.com/blogs/jonathan_kehayias/archive/2010/12/01/an-xevent-a-day-1-31-an-overview-of-extended-events.aspx for an excelent overview..

  • Hi Sharky,

    I do not totally agree with you!

    A - Did you even look at Glenn Barry's scripts?

    -- Top Cached SPs By Total Worker time (SQL 2008). Worker time relates to CPU cost

    -- Top Cached SPs By Total Logical Reads (SQL 2008). Logical reads relate to memory pressure

    -- Top Cached SPs By Total Logical Writes (SQL 2008). Logical writes relate to both memory and disk I/O pressure

    All of these are based upon the 'sys.dm_exec_procedure_stats' DMV. While this will not tell you all the spids that are running at the time of the CPU spike, it will tell you the problem queries that are sucking up CPU, Memory & Disk. By improving these queries, the overall performance should increase.

    B - A tool like SQL Sentry Performance monitor and event manager will find the needle in the hay stack at $1500-1995 dollars per server.

    http://www.sqlsentry.com/solutions-sql-server.asp

    C - A caution about extended events and SQL Server traces. Both do occur an overhead. A nice article by Jonathan Kehayias comparing the two.

    http://www.sqlperformance.com/2012/10/sql-trace/observer-overhead-trace-extended-events

    D - Even if you have a SQL trace file, you still have to weed through all the connections (spids) and queries running at that time.

    I suggest trying out Clear Trace to weed through the hay. http://www.sqlmag.com/article/performance/cleartrace

    In short, I do not think a trace will show the smoking gun a very busy system without a bunch of analysis.

    John

    John Miner
    Crafty DBA
    www.craftydba.com

  • Hi John,

    Dont disagree with you. Glen's DMV scripts is still the benchmark. Fair point about looking at memory consumers as well, as they might have a secondary CPU spike effect.

    His problem was just that he knew about the DMV's but wanted to find out a specific without a trace or XE set and thats not possible. That's why I suggest looking at the default system_health XE. As you said, XE although more lightweight will still require overhead(as would an external tool). Better to use it on a test server.

    Regards

    Stan

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply