Blog Post

Catching Transient CPU Spikes Using SQL Trace

,

Catching Transient CPU Spikes Using SQL Trace

 

If there was one thing I wish everyone who works with SQL Server would learn, it would have to be to use server-side SQL Traces instead of the client application SQL Server Profiler. Is Profiler really that bad? On a live production system, yes it is. Let me attach Profiler to any system, and I can bring it to a crawl in a matter of milliseconds.

 

In fact, I do a demo of this very thing in my presentation on using SQL Profiler/Trace Effectively. SQL Trace can be a very effective tool if used properly. My favorite use of this tool is dynamically triggering it to run in response to some event. To demonstrate this, I'm going to share a procedure I wrote to capture a SQL Trace when CPU utlization spikes above a certain threshold.

 

The Setup

 

A couple of years ago, we had a transient issue where CPU utlization would spike from an average of 30-40% to 95-100%. The spike was generally short lived and would disappear in 5 to 10 minutes. To further complicate things, the spikes showed no specific patterns of when they would occur, but the frequency was usually no more than once or twice per week.

 

We suspected that it was a code issue, but we did not want to leave a trace running full time to capture it. I wrote a procedure to check CPU utilization every 5 minutes. If the CPU spiked above the defined threshold, it would start a custom SQL trace running and capture the top 25 queries currently running by CPU time to a table.

 

The Procedure

 

The procedure is commented well and explains what it is doing, so I won't go on about it. And yes, this is production proven safe!!

 

Use master;

Go

 

If Exists (Select 1 from information_schema.routines

                  Where routine_name = 'dba_RunCPUUtilizationTrace'

                  And routine_schema = 'dbo')

      Drop procedure dbo.dba_RunCPUUtilizationTrace

Go

 

Set ANSI_Nulls On

Set Quoted_Identifier On

Go

 

Create procedure dbo.dba_RunCPUUtilizationTrace

      @Threshold int = 75 -- % of CPU utilization to use as a toggle point for turning the trace on and off

As

 

Declare @TraceID int,

      @StartDate nvarchar(35),

      @maxfilesize bigint,

      @on bit,

      @options int,

      @Path nvarchar(200),

      @TraceFile nvarchar(245),

      @TraceFileBaseName nvarchar(25),

      @TraceRunning bit,

      @CPU_Busy int,

      @Idle int,

      @CPUPercent decimal(5, 2)

 

Set NoCount On

 

Set @on = 1

Set @options = 2

Set @StartDate = Convert(nvarchar, getdate(), 112) + Replace(Convert(nvarchar, getdate(), 108), ':', '')

Set @maxfilesize = 5

Set @TraceRunning = 0

Set @TraceFileBaseName = 'CPUSpikeTrace'

 

-- Get current CPU counts

Select @CPU_Busy = @@CPU_BUSY,

      @Idle = @@IDLE

 

-- Delay for 1 second

WaitFor Delay '000:00:01'

 

-- Get current CPU counts and calculate % CPU utilization over the last 1 second

Set @CPUPercent = Cast(Round((@@CPU_BUSY - @CPU_Busy)/((@@IDLE - @Idle + @@CPU_BUSY - @CPU_Busy) * 1.0) * 100, 2) As decimal(5, 2))

 

-- If CPU > Threshold, wait 1 minute and retest.

-- This is to avoid tracing for a momentary spike.

-- Suggested by Sajesh Gopalakrishnan

If @CPUPercent >= @Threshold

  Begin

      -- Delay for 1 minute

      WaitFor Delay '000:01:00'

 

      -- Get new current CPU counts

      Select @CPU_Busy = @@CPU_BUSY,

            @Idle = @@IDLE

 

      -- Delay for 1 second

      WaitFor Delay '000:00:01'

 

      -- Get new current CPU counts and calculate % CPU utilization over the last 1 second

      Set @CPUPercent = Cast(Round((@@CPU_BUSY - @CPU_Busy)/((@@IDLE - @Idle + @@CPU_BUSY - @CPU_Busy) * 1.0) * 100, 2) As decimal(5, 2))

  End

 

-- Get file path of system trace file to use as target path for custom trace

Select @Path = Cast(value as nvarchar(255))

From fn_trace_getinfo(1) -- 1 = default trace

Where property = 2 -- 2 = value is path\name of trace file

 

-- Remove name of trace file from @Path

Set @Path = Reverse(Right(Reverse(@Path), Len(@Path) - CharIndex('\', Reverse(@Path))))

 

If Right(@Path, 1) <> '\'

      Set @Path = @Path + '\'

 

-- Check to see if custom trace already exists

Select @TraceID = traceid

From fn_trace_getinfo(null) -- null = all current traces

Where property = 2 -- 2 = value is path\name of trace file

And Cast(value as nvarchar(245)) Like @Path + @TraceFileBaseName + '%'

 

-- Check to see if custom trace is running (if exists)

If @TraceID Is Not Null

  Begin

      If Exists (Select 1 From fn_trace_getinfo(null) -- null = all current traces

                        Where property = 5 -- 5 = status of trace

                        And Cast(value as int) = 1 -- 1 = trace is active

                        And traceid = @TraceID)

        Begin

            Set @TraceRunning = 1

        End

      Else

        Begin

            -- Close trace and delete trace definition

            Exec sys.sp_trace_setstatus @TraceID, 2

        End

  End

 

-- If CPU below threshold and trace is running, stop the trace

If @CPUPercent < @Threshold And @TraceRunning = 1

  Begin

      Exec sys.sp_trace_setstatus @TraceID, 0 -- Turn trace off

      Exec sys.sp_trace_setstatus @TraceID, 2 -- Close trace and delete trace definition

  End

 

-- If CPU >= threshold and trace is not running, start the trace

If @CPUPercent >= @Threshold And @TraceRunning = 0

  Begin

      Set @TraceID = Null

      Set @TraceFile = @Path + @TraceFileBaseName + @StartDate

 

      -- Create trace definition

      Exec sys.sp_trace_create @traceid = @TraceID OUTPUT,

                                          @options = @options, -- rollover files

                                          @tracefile = @TraceFile,

                                          @maxfilesize = @maxfilesize

 

      -- Set the trace events

      -- These events were generated by SQL Profiler

      -- syntax: Exec sys.sp_trace_setevent @traceid, @eventid, @columnid, @on

      -- @eventid = 10 = RPC:Completed

      -- @columnid = 1 = TextData

      -- @columnid = 12 = SPID

      -- @columnid = 13 = Duration

      -- @columnid = 14 = StartTime

      -- @columnid = 15 = EndTime

      -- @columnid = 16 = Reads

      -- @columnid = 17 = Writes

      -- @columnid = 18 = CPU

      -- @columnid = 34 = ObjectName

      Exec sys.sp_trace_setevent @TraceID, 10, 15, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 16, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 1, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 17, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 18, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 34, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 12, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 13, @on

      Exec sys.sp_trace_setevent @TraceID, 10, 14, @on

 

      -- Set the trace filters

      -- Don't capture sp_reset_connection

      Exec sys.sp_trace_setfilter @TraceID, 1, 0, 7, N'Exec sp_reset_connection'

 

      -- Set the trace status to start

      Exec sys.sp_trace_setstatus @TraceID, 1

 

      If object_id('dbo.TopQueriesByCPU') Is Null

        Begin

            Create Table dbo.TopQueriesByCPU (

                              TrackID int identity(1, 1) not null primary key,

                              TrackDate datetime not null default (getdate()),

                              iTrackDate As (convert(int,convert(varchar,TrackDate,(112)),(0))),

                              iTrackTime As (convert(int,replace(convert(varchar,TrackDate,(108)),':',''),(0))),

                              CPU_Time int not null,

                              DBName sysname null,

                              ObjectName sysname null,

                              QueryText nvarchar(max) null,

                              Reads int null,

                              LogicalReads int null,

                              Writes int null)

        End

 

      -- Capture Top 25 queries by CPU time

      Insert Into dbo.TopQueriesByCPU (

            CPU_Time,

            DBName,

            ObjectName,

            QueryText,

            Reads,

            LogicalReads,

            Writes)

      SELECT TOP 25

            CPU_Time

            , DBName = db_name(qs.database_id)

            , ObjectName = object_name(qt.objectid, qs.database_id)

            , QueryText = SUBSTRING(qt.text, qs.statement_start_offset/2,

                                    (Case When qs.statement_end_offset = -1

                                                Then len(convert(nvarchar(max), qt.text)) * 2

                                          Else qs.statement_end_offset

                                          End - qs.statement_start_offset)/2)

            ,qs.reads

            ,qs.logical_reads

            ,qs.writes

      FROM sys.dm_exec_requests qs

      Cross Apply sys.dm_exec_sql_text(qs.sql_handle) as qt

      ORDER BY CPU_Time Desc, qs.logical_reads Desc;

  End

Set NoCount Off

Go

 

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating