Blog Post

Benchmark SQL Server Wait Statistics

,

One of the secret weapons in performance tuning with SQL Server is understanding wait statistics. Every time a process (spid) changes its status from running to suspended it is typically due to resource wait or a signal wait. For example, you are typically waiting on PAGEIOLATCH_SH when you run a query that needs to pull data from disk into memory for a select statement. Understanding wait statistics can be a very helpful tool in your tool-belt if you are brought in to troubleshoot a critical issue due to slow performance. Wait statistics can quickly eliminate several resources that aren’t causing your performance problem and allow you to focus on the root cause.

** Download all scripts here **

How Do We Get Wait Statistics?

Starting in SQL Server 2005 we were given access to dynamic management views and functions. One that is very helpful with collecting waits is sys.dm_os_wait_stats.  You can access your top waits by using the following query.

SELECT TOP 10 *
FROM sys.dm_os_wait_stats
where wait_time_ms > 0
order by wait_time_ms desc

This DMV collects waits from the time the instance starts unless its cleared with the following DBBC command below. I strongly recommend that you grab snapshots of this view and compare instead of purging the history. Once you run the DBCC SQLPERF command you lose all history of your waits. We will talk a little bit more about this in the next section.

DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR)

Mistakes with Wait Statistics

Typically in the field I have seen two different mistakes with Wait Statistics. Both mistakes are fixable. The first is an easy fix, the second one takes some knowledge to prevent knee jerk reactions that can cause more problems.

The first mistake I see is people constantly running DBCC SQLPERF to clear out the sys.dm_os_wait_stats to get a starting point. Typically, one would clear out all the waits wait a little bit and use a great query like Paul Randal’s – tell me why it hurts script or Glen Berry’s DMV scripts  to see what waits occurred in the last few seconds. Logically, this seems okay but you just dumped your history of wait statistics. You cannot go back and see what were your top resource waits over time.  I strongly recommend capturing a snapshot of this view, waiting a few seconds and then capturing it again and then compare the two captures. This way you still have your history.

The second mistake is not understanding the wait types or making bad decisions based on the top wait types.  Are there good waits types? What reactions should I take due to the wait types? These are both very good questions and we cover them at a basic view in this post. First, thing I recommend doing before running any wait stats queries is get a good reference list of waits types. I personally recommend MS Whitepaper on Waits and Queues and The SQL Server Wait Type Repository by CSS SQL Server Engineers. These can be very helpful when you need to understand your top waits or build an action plan behind your waits.

Let’s take a look at the TOP 10 query result set from the query above.
Top WaitsIf you look at the top four waits in CSS SQL Server Engineers blog you will notice that for various reasons it is safe to skip these wait types. Therefore, when were collecting waits we would want to bypass waits that are not indication of resource pressure actionable.

Finally, you need to be careful about the advice on the internet. It can be like taking candy from strangers. There still quite a bit of misleading advice that can cause more harm than good. An easy example is CXPACKET waits. There are articles that recommend setting Max Degree of Parallelism to 1 for the instance. This would actually be the last thing I would want to do by just seeing CXPACKET as a top wait type.  I would want to find the queries causing  CXPACKET waits see if they are effecting the business  and tune them if needed. If that isn’t possible than I would consider this recommendation.

Capture Wait Statisitcs

So, now if you are still reading you have read about why understanding wait statistics is a good thing, basics of collecting wait statistics, and common mistakes with wait statistics. Lets jump to how I go about collecting wait statistics.

The following script will utilize three temp tables. One for a snapshot of wait statistics and another for comparing waits, and the last table for wait types we would want to skip due to the waits being background processes or waits that could safely be ignored. It will get a delta of your wait statistics for five minutes.

/* Create temp tables to capture wait stats to compare */IF OBJECT_ID('tempdb..#WaitStatsBench') IS NOT NULL
DROP TABLE #WaitStatsBench
IF OBJECT_ID('tempdb..#WaitStatsFinal') IS NOT NULL
DROP TABLE #WaitStatsFinal
IF OBJECT_ID('tempdb..#BenignWaits') IS NOT NULL
DROP TABLE #BenignWaits
CREATE TABLE #WaitStatsBench (WaitType varchar(200), wait_S decimal(12,5), Resource_S decimal (12,5), Signal_S decimal (12,5), WaitCount bigint)
CREATE TABLE #WaitStatsFinal (WaitType varchar(200), wait_S decimal(12,5), Resource_S decimal (12,5), Signal_S decimal (12,5), WaitCount bigint)
CREATE TABLE #BenignWaits (WaitType varchar(200))
/* Insert Benign Waits */INSERT INTO #BenignWaits (WaitType)
VALUES ('CLR_SEMAPHORE')
INSERT INTO #BenignWaits (WaitType)
VALUES ('LAZYWRITER_SLEEP')
INSERT INTO #BenignWaits (WaitType)
VALUES ('RESOURCE_QUEUE')
INSERT INTO #BenignWaits (WaitType)
VALUES ('SLEEP_TASK')
INSERT INTO #BenignWaits (WaitType)
VALUES ('SLEEP_SYSTEMTASK')
INSERT INTO #BenignWaits (WaitType)
VALUES ('SQLTRACE_BUFFER_FLUSH')
INSERT INTO #BenignWaits (WaitType)
VALUES ('WAITFOR')
INSERT INTO #BenignWaits (WaitType)
VALUES ('LOGMGR_QUEUE')
INSERT INTO #BenignWaits (WaitType)
VALUES ('CHECKPOINT_QUEUE')
INSERT INTO #BenignWaits (WaitType)
VALUES ('REQUEST_FOR_DEADLOCK_SEARCH')
INSERT INTO #BenignWaits (WaitType)
VALUES ('XE_TIMER_EVENT')
INSERT INTO #BenignWaits (WaitType)
VALUES ('BROKER_TO_FLUSH')
INSERT INTO #BenignWaits (WaitType)
VALUES ('BROKER_TASK_STOP')
INSERT INTO #BenignWaits (WaitType)
VALUES ('CLR_MANUAL_EVENT')
INSERT INTO #BenignWaits (WaitType)
VALUES ('CLR_AUTO_EVENT')
INSERT INTO #BenignWaits (WaitType)
VALUES ('DISPATCHER_QUEUE_SEMAPHORE')
INSERT INTO #BenignWaits (WaitType)
VALUES ('FT_IFTS_SCHEDULER_IDLE_WAIT')
INSERT INTO #BenignWaits (WaitType)
VALUES ('XE_DISPATCHER_WAIT')
INSERT INTO #BenignWaits (WaitType)
VALUES ('XE_DISPATCHER_JOIN')
INSERT INTO #BenignWaits (WaitType)
VALUES ('BROKER_EVENTHANDLER')
INSERT INTO #BenignWaits (WaitType)
VALUES ('TRACEWRITE')
INSERT INTO #BenignWaits (WaitType)
VALUES ('FT_IFTSHC_MUTEX')
INSERT INTO #BenignWaits (WaitType)
VALUES ('SQLTRACE_INCREMENTAL_FLUSH_SLEEP')
INSERT INTO #BenignWaits (WaitType)
VALUES ('BROKER_RECEIVE_WAITFOR')
INSERT INTO #BenignWaits (WaitType)
VALUES ('ONDEMAND_TASK_QUEUE')
INSERT INTO #BenignWaits (WaitType)
VALUES ('DBMIRROR_EVENTS_QUEUE')
INSERT INTO #BenignWaits (WaitType)
VALUES ('DBMIRRORING_CMD')
INSERT INTO #BenignWaits (WaitType)
VALUES ('BROKER_TRANSMITTER')
INSERT INTO #BenignWaits (WaitType)
VALUES ('SQLTRACE_WAIT_ENTRIES')
INSERT INTO #BenignWaits (WaitType)
VALUES ('SLEEP_BPOOL_FLUSH')
INSERT INTO #BenignWaits (WaitType)
VALUES ('SQLTRACE_LOCK')
INSERT INTO #BenignWaits (WaitType)
VALUES ('DIRTY_PAGE_POLL')
INSERT INTO #BenignWaits (WaitType)
VALUES ('SP_SERVER_DIAGNOSTICS_SLEEP')
INSERT INTO #BenignWaits (WaitType)
VALUES ('HADR_FILESTREAM_IOMGR_IOCOMPLETION')
GO
/* Get baseline */INSERT INTO #WaitStatsBench (WaitType, wait_S, Resource_S, Signal_S, WaitCount)
SELECT
wait_type,
wait_time_ms / 1000.0 AS WaitS,
(wait_time_ms - signal_wait_time_ms) / 1000.0 AS ResourceS,
signal_wait_time_ms / 1000.0 AS SignalS,
waiting_tasks_count AS WaitCount
FROM sys.dm_os_wait_stats
WHERE wait_type not in (select BW.WaitType from #BenignWaits BW)
AND wait_time_ms > 0
/**** Wait some time... ******/DECLARE @WaitSeconds int = 300
WAITFOR DELAY @WaitSeconds
GO
INSERT INTO #WaitStatsFinal (WaitType, wait_S, Resource_S, Signal_S, WaitCount)
SELECT
wait_type,
wait_time_ms / 1000.0 AS WaitS,
(wait_time_ms - signal_wait_time_ms) / 1000.0 AS ResourceS,
signal_wait_time_ms / 1000.0 AS SignalS,
waiting_tasks_count AS WaitCount
FROM sys.dm_os_wait_stats
WHERE wait_type not in (select BW.WaitType from #BenignWaits BW)
AND wait_time_ms > 0
SELECT f.WaitType,
f.wait_S - b.wait_S as Wait_S,
f.Resource_S - b.Resource_S as Resource_S,
f.Signal_S - b.Signal_S as Signal_S,
f.WaitCount - b.WaitCount as WaitCounts,
CAST(CASE WHEN f.WaitCount - b.WaitCount = 0 THEN 0 ELSE (f.wait_S - b.wait_S) / (f.WaitCount - b.WaitCount) END AS numeric(10, 6))AS Avg_Wait_S,
CAST(CASE WHEN f.WaitCount - b.WaitCount = 0 THEN 0 ELSE (f.Resource_S - b.Resource_S) / (f.WaitCount - b.WaitCount) END AS numeric(10, 6))AS Avg_Resource_S,
CAST(CASE WHEN f.WaitCount - b.WaitCount = 0 THEN 0 ELSE (f.Signal_S - b.Signal_S) / (f.WaitCount - b.WaitCount) END AS numeric(10, 6))AS Avg_Signal_S
FROM #WaitStatsFinal f
LEFT JOIN #WaitStatsBench b ON (f.WaitType = b.WaitType)
WHERE (f.wait_S - b.wait_S) > 0
ORDER BY 2 desc, 4 desc

Benchmark Wait Statistics

Finally, I will include the code used for creating a stored procedure that will give you control on how long the wait statistics capture runs and how often we should collect during the capture process.

CREATE SCHEMA Waits
Go
CREATE TABLE Waits.WaitStats (CaptureDataID bigint, WaitType varchar(200), wait_S decimal(20,5), Resource_S decimal (20,5), Signal_S decimal (20,5), WaitCount bigint, Avg_Wait_S numeric(10, 6), Avg_Resource_S numeric(10, 6),Avg_Signal_S numeric(10, 6), CaptureDate datetime)
CREATE TABLE Waits.BenignWaits (WaitType varchar(200))
CREATE TABLE Waits.CaptureData (
ID bigint identity PRIMARY KEY,
StartTime datetime,
EndTime datetime,
ServerName varchar(500),
PullPeriod int
)
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('CLR_SEMAPHORE')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('LAZYWRITER_SLEEP')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('RESOURCE_QUEUE')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('SLEEP_TASK')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('SLEEP_SYSTEMTASK')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('SQLTRACE_BUFFER_FLUSH')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('WAITFOR')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('LOGMGR_QUEUE')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('CHECKPOINT_QUEUE')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('REQUEST_FOR_DEADLOCK_SEARCH')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('XE_TIMER_EVENT')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('BROKER_TO_FLUSH')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('BROKER_TASK_STOP')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('CLR_MANUAL_EVENT')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('CLR_AUTO_EVENT')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('DISPATCHER_QUEUE_SEMAPHORE')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('FT_IFTS_SCHEDULER_IDLE_WAIT')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('XE_DISPATCHER_WAIT')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('XE_DISPATCHER_JOIN')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('BROKER_EVENTHANDLER')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('TRACEWRITE')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('FT_IFTSHC_MUTEX')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('SQLTRACE_INCREMENTAL_FLUSH_SLEEP')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('BROKER_RECEIVE_WAITFOR')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('ONDEMAND_TASK_QUEUE')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('DBMIRROR_EVENTS_QUEUE')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('DBMIRRORING_CMD')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('BROKER_TRANSMITTER')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('SQLTRACE_WAIT_ENTRIES')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('SLEEP_BPOOL_FLUSH')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('SQLTRACE_LOCK')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('DIRTY_PAGE_POLL')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('SP_SERVER_DIAGNOSTICS_SLEEP')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('HADR_FILESTREAM_IOMGR_IOCOMPLETION')
INSERT INTO Waits.BenignWaits (WaitType)
VALUES ('HADR_WORK_QUEUE')
GO
--DROP PROCEDURE Waits.GetWaitStats
CREATE PROCEDURE Waits.GetWaitStats
@WaitTimeSec INT = 60,
@StopTime DATETIME = NULL
AS
BEGIN
DECLARE @CaptureDataID int
/* Create temp tables to capture wait stats to compare */IF OBJECT_ID('tempdb..#WaitStatsBench') IS NOT NULL
DROP TABLE #WaitStatsBench
IF OBJECT_ID('tempdb..#WaitStatsFinal') IS NOT NULL
DROP TABLE #WaitStatsFinal
CREATE TABLE #WaitStatsBench (WaitType varchar(200), wait_S decimal(20,5), Resource_S decimal (20,5), Signal_S decimal (20,5), WaitCount bigint)
CREATE TABLE #WaitStatsFinal (WaitType varchar(200), wait_S decimal(20,5), Resource_S decimal (20,5), Signal_S decimal (20,5), WaitCount bigint)
DECLARE @ServerName varchar(300)
SELECT @ServerName = convert(nvarchar(128), serverproperty('servername'))
/* Insert master record for capture data */INSERT INTO Waits.CaptureData (StartTime, EndTime, ServerName,PullPeriod)
VALUES (GETDATE(), NULL, @ServerName, @WaitTimeSec)
SELECT @CaptureDataID = SCOPE_IDENTITY()
/* Loop through until time expires */IF @StopTime IS NULL
SET @StopTime = DATEADD(hh, 1, getdate())
WHILE GETDATE() < @StopTime
BEGIN
/* Get baseline */INSERT INTO #WaitStatsBench (WaitType, wait_S, Resource_S, Signal_S, WaitCount)
SELECT
wait_type,
wait_time_ms / 1000.0 AS WaitS,
(wait_time_ms - signal_wait_time_ms) / 1000.0 AS ResourceS,
signal_wait_time_ms / 1000.0 AS SignalS,
waiting_tasks_count AS WaitCount
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0
AND wait_type NOT IN ( SELECT WaitType FROM Waits.BenignWaits)
/* Wait a few minutes and get final snapshot */WAITFOR DELAY @WaitTimeSec;
INSERT INTO #WaitStatsFinal (WaitType, wait_S, Resource_S, Signal_S, WaitCount)
SELECT
wait_type,
wait_time_ms / 1000.0 AS WaitS,
(wait_time_ms - signal_wait_time_ms) / 1000.0 AS ResourceS,
signal_wait_time_ms / 1000.0 AS SignalS,
waiting_tasks_count AS WaitCount
FROM sys.dm_os_wait_stats
WHERE wait_time_ms > 0
AND wait_type NOT IN ( SELECT WaitType FROM Waits.BenignWaits)
DECLARE @CaptureTime datetime
SET @CaptureTime = getdate()
INSERT INTO Waits.WaitStats (CaptureDataID, WaitType, Wait_S, Resource_S, Signal_S, WaitCount, Avg_Wait_S, Avg_Resource_S,Avg_Signal_S, CaptureDate)
SELECT @CaptureDataID,
f.WaitType,
f.wait_S - b.wait_S as Wait_S,
f.Resource_S - b.Resource_S as Resource_S,
f.Signal_S - b.Signal_S as Signal_S,
f.WaitCount - b.WaitCount as WaitCounts,
CAST(CASE WHEN f.WaitCount - b.WaitCount = 0 THEN 0 ELSE (f.wait_S - b.wait_S) / (f.WaitCount - b.WaitCount) END AS numeric(10, 6))AS Avg_Wait_S,
CAST(CASE WHEN f.WaitCount - b.WaitCount = 0 THEN 0 ELSE (f.Resource_S - b.Resource_S) / (f.WaitCount - b.WaitCount) END AS numeric(10, 6))AS Avg_Resource_S,
CAST(CASE WHEN f.WaitCount - b.WaitCount = 0 THEN 0 ELSE (f.Signal_S - b.Signal_S) / (f.WaitCount - b.WaitCount) END AS numeric(10, 6))AS Avg_Signal_S,
@CaptureTime
FROM #WaitStatsFinal f
LEFT JOIN #WaitStatsBench b ON (f.WaitType = b.WaitType)
TRUNCATE TABLE #WaitStatsBench
TRUNCATE TABLE #WaitStatsFinal
END -- END of WHILE
/* Update Capture Data meta-data to include end time */UPDATE Waits.CaptureData
SET EndTime = GETDATE()
WHERE ID = @CaptureDataID
END

Now, you can run the following block of code to capture your wait statistics.

DECLARE @EndTime datetime, @WaitSeconds int
SELECT @EndTime = DATEADD(MINUTE, 30, getdate()),
@WaitSeconds = 30
EXEC Waits.GetWaitStats
@WaitTimeSec = @WaitSeconds,
@StopTime = @EndTime

If you liked this blog post check out my next one where I focus on finding the queries that caused my waits.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating