When it comes to troubleshooting in relational databases, there’s no better place to start than wait statistics. In a nutshell, a wait statistic is an internal counter that tells you how long the database spent waiting for a particular resource, activity, or process. Since wait statistics are categorized by type, one look will quickly tell the variety of problem that needs your attention, assuming you know meaning for Microsoft’s lingo for each wait type.
(As an aside, wait stats were implemented in the earliest days of relational database computing because the early RDBMSes ran on multiple operating systems. So the database vendors needed a reliable method of troubleshooting database performance which was independent of the OS).
Wait stats help you clue in to the best approach and path for troubleshooting. For example, if your top wait stat showed a lot of time spent on acquiring locks, you could pretty well rest assured that trying to tune networking latency would be a total waste of your time. Conversely, wait stats could also help you understand that perhaps the ‘usual suspects’ of poor database performance (IO, CPU, etc) weren’t actually worth consideration.
Because wait statistics are broad, they’re typically where you start your troubleshooting process but not where it ends. That’s because wait statistics don’t actually point to the smoking gun that’s causing the true performance problem. For example, let’s say your SQL Server is experiencing an unusually high amount of RESOURCE_SEMAPHORE waits. Resource semaphores relate to query compilation and reserving memory for queries as they’re being compiled. But on a busy system,WHICH of your queries are contributing to this wait? It can require a lot more investigation to figure that part out. (I’ve considered spending some more time in future posts walking through the entire troubleshooting process. If you’re interested let me know in the comments).
Now, in SQL Server, it is possible to determine the wait stats accrued by a given thread or even a specific query or transaction. But this information is only retained by SQL Server while “in flight”. Long-term retention of wait stat information is only for the broad categories.
WAIT STATS THEN…
For some historical perspective, you have to go back to the white paper SQL Server 2005 Waits and Queues (By Tom Davidson) to see where it all began for SQL Server. (http://download.microsoft.com/download/4/7/a/47a548b9-249e-484c-abd7-29f31282b04d/Performance_Tuning_Waits_Queues.doc)
Prior to SQL Server 2005, wait stats of a sort where identifiable using the DBCC SQLPERF(UMSSTATS) and DBCC SQLPERF(WAITSTATS). These commands are still around, btw. You can see these early indications of UMSStats (User Mode Scheduler) and wait stats in Microsoft KB articles like Description of WAITTYPE and LASTWAITTYPE (http://support.microsoft.com/kb/822101) and other early blog posts.
Wait Stats started to come into major prominence when folks like Joe Sack (blog | twitter)
(http://blogs.msdn.com/jimmymay/archive/2009/04/27/wait-stats-by-joe-sack.aspx) and Jimmy May (blog | twitter) started to write about them (http://blogs.msdn.com/b/jimmymay/archive/2009/04/26/wait-stats-introductory-references.aspx)
And if you didn’t get them then, you definitely need the SQL Server Diagnostic Queries by Glenn Berry (blog | twitter) , which have a number of wait stat queries already written for you. Glenn started this collection of queries back in 2005 and has kept it up to date ever since. The latest version of queries are at http://sqlserverperformance.wordpress.com/2012/07/08/sql-server-2012-diagnostic-information-queries-july-2012/.
I also started to put a lot of attention on them, such as when MCM and UK MVP Christian Bolton (Blog | Twitter) and I did the webcast The 5-Minute SQL Server Health Check (http://sqlblogcasts.com/blogs/christian/archive/2009/11/16/webcast-now-on-demand-the-5-minute-sql-server-healthcheck.aspx).
…AND WAIT STATS NOW
Nowadays, wait stats are quite well documented. You can get a great review of all of the wait stats for SQL Server simply by looking at the Books Online (BOL) topic (http://msdn.microsoft.com/en-us/library/ms179984.aspx).
The downside of Microsoft’s documentation in BOL is that it tells you a nice bit of info about each of the wait stats, but not how to remediate them if they are turning into a problem on your SQL Server. But times have changed – there’s so much good information that all you need (most of the time) is to perform an internet search for ‘SQL Server my_problem_wait_stat’ and you’ll probably get at least one good hit by Microsoft customer support or an MVP blogger. All you need to do before the search is to find the type of wait stat that’s causing the problem.
These days, all you need to
solve a wait stat problem is an
internet search for ‘SQL Server
There are also a couple good books and eBooks on the topic. Kalen Delaney’s Inside SQL Server books, especially Chapter 2 in the edition sitting on my shelf, are outstanding. Joes2Pros also has a nice, succinct book on wait stats here (http://joes2pros.com/?wpsc-product=sql-wait-stats-joes-2-pros).
Jonathan Kehayias (blog | twitter), of SQLSkills.com, has a great eBook in short form for free and a longer, more comprehensive version for a small fee. (https://www.simple-talk.com/books/sql-books/troubleshooting-sql-server-a-guide-for-the-accidental-dba/) at Simple-Talk.
What are your favorite wait stat resources? Have you written a blog post that broadens or deepens our knowledge of wait stats in SQL Server? If so, I’d love for you to post a comment here with a link back to your article! Let me know what you think. Thanks,