Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Catching Transient CPU Spikes Using SQL Trace

Catching Transient CPU Spikes Using SQL Trace
 
If there was one thing I wish everyone who works with SQL Server would learn, it would have to be to use server-side SQL Traces instead of the client application SQL Server Profiler. Is Profiler really that bad? On a live production system, yes it is. Let me attach Profiler to any system, and I can bring it to a crawl in a matter of milliseconds.
 
In fact, I do a demo of this very thing in my presentation on using SQL Profiler/Trace Effectively. SQL Trace can be a very effective tool if used properly. My favorite use of this tool is dynamically triggering it to run in response to some event. To demonstrate this, I'm going to share a procedure I wrote to capture a SQL Trace when CPU utlization spikes above a certain threshold.
 
The Setup
 
A couple of years ago, we had a transient issue where CPU utlization would spike from an average of 30-40% to 95-100%. The spike was generally short lived and would disappear in 5 to 10 minutes. To further complicate things, the spikes showed no specific patterns of when they would occur, but the frequency was usually no more than once or twice per week.
 
We suspected that it was a code issue, but we did not want to leave a trace running full time to capture it. I wrote a procedure to check CPU utilization every 5 minutes. If the CPU spiked above the defined threshold, it would start a custom SQL trace running and capture the top 25 queries currently running by CPU time to a table.
 
The Procedure
 
The procedure is commented well and explains what it is doing, so I won't go on about it. And yes, this is production proven safe!!
 

Use master;

Go

 

If Exists (Select 1 from information_schema.routines

                  Where routine_name = 'dba_RunCPUUtilizationTrace'

                  And routine_schema = 'dbo')

      Drop procedure dbo.dba_RunCPUUtilizationTrace

Go

 

Set ANSI_Nulls On

Set Quoted_Identifier On

Go

 

Create procedure dbo.dba_RunCPUUtilizationTrace

      @Threshold int = 75 -- % of CPU utilization to use as a toggle point for turning the trace on and off

As

 

Declare @TraceID int,

      @StartDate nvarchar(35),

      @maxfilesize bigint,

      @on bit,

      @options int,

      @Path nvarchar(200),

      @TraceFile nvarchar(245),

      @TraceFileBaseName nvarchar(25),

      @TraceRunning bit,

      @CPU_Busy int,

      @Idle int,

      @CPUPercent decimal(5, 2)

 

Set NoCount On

 

Set @on = 1

Set @options = 2

Set @StartDate = Convert(nvarchar, getdate(), 112) + Replace(Convert(nvarchar, getdate(), 108), ':', '')

Set @maxfilesize = 5

Set @TraceRunning = 0

Set @TraceFileBaseName = 'CPUSpikeTrace'

 

-- Get current CPU counts

Select @CPU_Busy = @@CPU_BUSY,

      @Idle = @@IDLE

 

-- Delay for 1 second

WaitFor Delay '000:00:01'

 

-- Get current CPU counts and calculate % CPU utilization over the last 1 second

Set @CPUPercent = Cast(Round((@@CPU_BUSY - @CPU_Busy)/((@@IDLE - @Idle + @@CPU_BUSY - @CPU_Busy) * 1.0) * 100, 2) As decimal(5, 2))

 

-- If CPU > Threshold, wait 1 minute and retest.

-- This is to avoid tracing for a momentary spike.

-- Suggested by Sajesh Gopalakrishnan

If @CPUPercent >= @Threshold

  Begin

      -- Delay for 1 minute

      WaitFor Delay '000:01:00'

 

      -- Get new current CPU counts

      Select @CPU_Busy = @@CPU_BUSY,

            @Idle = @@IDLE

 

      -- Delay for 1 second

      WaitFor Delay '000:00:01'

 

      -- Get new current CPU counts and calculate % CPU utilization over the last 1 second

      Set @CPUPercent = Cast(Round((@@CPU_BUSY - @CPU_Busy)/((@@IDLE - @Idle + @@CPU_BUSY - @CPU_Busy) * 1.0) * 100, 2) As decimal(5, 2))

  End

 

-- Get file path of system trace file to use as target path for custom trace

Select @Path = Cast(value as nvarchar(255))

From fn_trace_getinfo(1) -- 1 = default trace

Where property = 2 -- 2 = value is path\name of trace file

 

-- Remove name of trace file from @Path

Set @Path = Reverse(Right(Reverse(@Path), Len(@Path) - CharIndex('\', Reverse(@Path))))

 

If Right(@Path, 1) <> '\'

      Set @Path = @Path + '\'

 

-- Check to see if custom trace already exists

Select @TraceID = traceid

From fn_trace_getinfo(null) -- null = all current traces

Where property = 2 -- 2 = value is path\name of trace file

And Cast(value as nvarchar(245)) Like @Path + @TraceFileBaseName + '%'

 

-- Check to see if custom trace is running (if exists)

If @TraceID Is Not Null

  Begin

      If Exists (Select 1 From fn_trace_getinfo(null) -- null = all current traces

                        Where property = 5 -- 5 = status of trace

                        And Cast(value as int) = 1 -- 1 = trace is active

                        And traceid = @TraceID)

        Begin

            Set @TraceRunning = 1

        End

      Else

        Begin

            -- Close trace and delete trace definition

            Exec sys.sp_trace_setstatus @TraceID, 2

        End

  End

 

-- If CPU below threshold and trace is running, stop the trace

If @CPUPercent < @Threshold And @TraceRunning = 1

  Begin

      Exec sys.sp_trace_setstatus @TraceID, 0 -- Turn trace off

      Exec sys.sp_trace_setstatus @TraceID, 2 -- Close trace and delete trace definition

  End

 

-- If CPU >= threshold and trace is not running, start the trace

If @CPUPercent >= @Threshold And @TraceRunning = 0

  Begin

      Set @TraceID = Null

      Set @TraceFile = @Path + @TraceFileBaseName + @StartDate

 

      -- Create trace definition

      Exec sys.sp_trace_create @traceid = @TraceID OUTPUT,

                                          @options = @options, -- rollover files

                                          @tracefile = @TraceFile,

                                          @maxfilesize = @maxfilesize

 

      -- Set the trace events

      -- These events were generated by SQL Profiler

      -- syntax: Exec sys.sp_trace_setevent @traceid, @eventid, @columnid, @on

      -- @eventid = 10 = RPC:Completed

      -- @columnid = 1 = TextData

      -- @columnid = 12 = SPID

      -- @columnid = 13 = Duration

      -- @columnid = 14 = StartTime

      -- @columnid = 15 = EndTime

      -- @columnid = 16 = Reads

      -- @columnid = 17 = Writes

      -- @columnid = 18 = CPU

      -- @columnid = 34 = ObjectName

      Exec sys.sp_trace_setevent @TraceID, 10, 15, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 16, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 1, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 17, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 18, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 34, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 12, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 13, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 14, @on

 

      -- Set the trace filters

      -- Don't capture sp_reset_connection

      Exec sys.sp_trace_setfilter @TraceID, 1, 0, 7, N'Exec sp_reset_connection'

 

      -- Set the trace status to start

      Exec sys.sp_trace_setstatus @TraceID, 1

 

      If object_id('dbo.TopQueriesByCPU') Is Null

        Begin

            Create Table dbo.TopQueriesByCPU (

                              TrackID int identity(1, 1) not null primary key,

                              TrackDate datetime not null default (getdate()),

                              iTrackDate As (convert(int,convert(varchar,TrackDate,(112)),(0))),

                              iTrackTime As (convert(int,replace(convert(varchar,TrackDate,(108)),':',''),(0))),

                              CPU_Time int not null,

                              DBName sysname null,

                              ObjectName sysname null,

                              QueryText nvarchar(max) null,

                              Reads int null,

                              LogicalReads int null,

                              Writes int null)

        End

 

      -- Capture Top 25 queries by CPU time

      Insert Into dbo.TopQueriesByCPU (

            CPU_Time,

            DBName,

            ObjectName,

            QueryText,

            Reads,

            LogicalReads,

            Writes)

      SELECT TOP 25

            CPU_Time

            , DBName = db_name(qs.database_id)

            , ObjectName = object_name(qt.objectid, qs.database_id)

            , QueryText = SUBSTRING(qt.text, qs.statement_start_offset/2,

                                    (Case When qs.statement_end_offset = -1

                                                Then len(convert(nvarchar(max), qt.text)) * 2

                                          Else qs.statement_end_offset

                                          End - qs.statement_start_offset)/2)

            ,qs.reads

            ,qs.logical_reads

            ,qs.writes

      FROM sys.dm_exec_requests qs

      Cross Apply sys.dm_exec_sql_text(qs.sql_handle) as qt

      ORDER BY CPU_Time Desc, qs.logical_reads Desc;

  End

Set NoCount Off

Go

 

Comments

Posted by Anonymous on 7 April 2010

This post was mentioned on Twitter by SQLSoldier: [Blogged] Catching Transient CPU Spikes Using SQL Trace: http://bit.ly/bxoZ0E #sql #sqlserver #sqlgeek

Posted by Steve Jones on 7 April 2010

Very nice. I could see this as a handy template gathering information for any number of issues that you want to trace. I'd love to see a full fledged article writeup on how you use this to solve an issue.

Posted by Jason Brimhall on 7 April 2010

Very nice indeed.  Thank you for sharing this information.

Posted by David George (tm) on 23 April 2010

Great script! I've notied the CPUPercent often returns 0.00 - 13.00, when in fact the CPU may be 30% or even 90%+?

Posted by Robert Davis on 29 April 2010

Hi dmathis. What you are measuring with this script is the amount of CPU that SQL Server is using. So if this script is returning low CPU values and your CPU utilization is actually high, then it is not SQL Server that is consuming it all. In this case, you definitely don't need to trace the SQL Server because SQL is not your issue.

Posted by Anonymous on 29 April 2010

Pingback from  Baby Bottle Travel Steam Sterilizers | Baby Travel Systems

Posted by Dean Jones on 18 June 2010

I tested out this script by writing a query i know takes up CPU resources, i ussed SQLCMD on various screen prompts and called the stored procedure in a long lasting loop.

What i noticed was that CPU was constantly pegged at 100% and yet the script is not reporting such figures.

Thanks.

Posted by John on 18 August 2010

I had started a trace using the above script, but now I am unable to stop it, can you please let me know how I can stop this trace?

Posted by Einar Gudjonsson on 14 September 2010

Hi,

This is a great script which we've used for a few months now and it has helped us pin-pointing some nasty queries that were cripling our server from time to time.

Today we had such an incident and I needed to run the script, but this time I got the error message: "Arithmetic overflow occurred", five times. And nothing is recorded in the trace table.

Any one had that same problem with this?

Posted by andrewleihsing on 17 April 2012

Does anyone know if this script works on sql 2008 and/or server 2008 R2.  Getting odd cpu perf. % results, think there is a diff. way to get cpu % util.  Not sure, figured I ask the guru's.

Thanks in advance.

Posted by sunmoon.qa on 19 April 2013

after i ran the script I did not get an output.

what is the procedure to run this script?

please advise. thanks in advance

Leave a Comment

Please register or log in to leave a comment.