
Sneak Preview: Start to Finish Guide to SQL Server Performance Monitoring

By Brian Kelley


One of the nice things about a book store is the ability to go, grab a book, and look through it. Often before I purchase a book, I'll do just that. If I want to order a book on-line because of a deal and I can subdue that "must have it NOW" impulse in myself, I'll still make a drive down to my favorite bookstores and see if I can find it on the shelf. This is much more difficult with an e-Book.

To try and give more of that "browse the bookshelf" experience, I've prepared some excerpts of the e-Book Start to Finish Guide to SQL Server Performance Monitoring. I've made selections of several key sections of the book. Here are the excerpts I've chosen:

If you are interested in purchasing the e-Book through SQL Server Central, please visit the NetImpress page for SQLServerCentral.com customers.

Excerpt 1: From Performance Counters

SQL Server Counters

SQL Server Counters Quick Reference

SQLServer:Buffer Manager\Buffer Cache Hit Ratio

% of data pages found in memory

SQLServer:Cache Manager\Cache Hit Ratio

Ratio of execution cache hits versus total number of lookups

SQLServer:General Statistics\User Connections

Number of currently connected users

SQLServer:Latches\Average Latch Wait Time (ms)

Average time to obtain a latch

SQLServer:Latches\Latch Waits/sec

Number of latches established per second

SQLServer:Locks\Average Wait Time (ms)

Average time to obtain a lock

SQLServer:Locks\Lock Timeouts/sec

# of lock requests per second that time out without the lock being granted

SQLServer:Locks\Number of Deadlocks/sec

Average # of deadlocks occurring per second

SQLServer:Memory Manager\Total Server Memory

Amount of memory being utilized by SQL Server

SQLServer:SQL Statistics\SQL Re-Compilations/sec

Average # of execution plans being recompiled per second

Just as the hardware has counters, so does SQL Server. There are specific counters we use in conjunction with the hardware counters to determine how well SQL Server is performing on a given system. We also have to realize that the way SQL Server operates is fundamentally different from a file server, web server, or domain controller. So there may be times when we'll see processor utilization near 100% for a given period (for instance, while a particularly complex DTS package performs our data warehousing load), and we have to take those things into account.

SQLServer:Buffer Manager\Buffer Cache Hit Ratio

Threshold: < 90 %

The Buffer Cache Hit Ratio represents the percentage of data pages found in memory. When SQL Server finds the data in memory, it doesn't have to go to the physical disk to retrieve it. Since retrieving data from a hard disk is significantly slower than retrieving it from physical memory, we want to pull data from memory as often as possible. As a result, we're looking for a Buffer Cache Hit Ratio as close to 100% as possible, and the threshold is understandably high. If we start dipping below 90%, it's time to consider extra RAM. Others feel that in an OLTP environment this number should be greater than 95%; still others look for 98%. I like to consider 90% a minimum.
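The same counter values can also be pulled from T-SQL. Here is a minimal sketch against SQL Server 2000's master.dbo.sysperfinfo; the counter names assume a default instance (named instances use an MSSQL$InstanceName prefix), and later versions expose the same data through sys.dm_os_performance_counters:

```sql
-- The Buffer Cache Hit Ratio is stored as two raw counters;
-- the percentage System Monitor displays is the ratio of the two.
SELECT 100.0 * r.cntr_value / b.cntr_value AS [Buffer Cache Hit Ratio %]
FROM master.dbo.sysperfinfo r
JOIN master.dbo.sysperfinfo b
    ON b.[object_name] = r.[object_name]
WHERE r.[object_name] = 'SQLServer:Buffer Manager'
  AND r.counter_name  = 'Buffer cache hit ratio'
  AND b.counter_name  = 'Buffer cache hit ratio base'
```

Note that unlike System Monitor, this gives a single point-in-time reading rather than a sampled graph, so it is best suited to quick spot checks.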

SQLServer:Cache Manager\Cache Hit Ratio

Threshold: Varies (<90% for in-house developed applications)

This counter corresponds to the ratio of execution plan cache hits to total lookups. We want this number as close to 100% as possible regardless of the instance, but I've not seen any hard and fast numbers for a threshold. For applications where we built the data access layer ourselves, we look for numbers around 90%. For third-party applications there is often little we can do, as the data access layer is hard-coded and not subject to modification.

There are two reasons why the number could be low:

  • An execution plan doesn’t exist
  • The execution plan couldn’t be located on the first pass

The first possibility can occur if we're short on memory or if our stored procedures, triggers, and ad hoc queries require recompiling, making an execution plan of little use. The second tends to occur either because we've not specified an owner and the user isn't dbo, or because we've used the sp_ prefix for a stored procedure in a user database. I'll discuss these issues in more detail later.
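As a sketch of the second issue (the procedure names here are hypothetical):

```sql
-- Owner-qualified call: SQL Server resolves dbo.usp_GetOrders directly
-- and can find the cached plan on the first pass.
EXEC dbo.usp_GetOrders

-- Unqualified call by a user other than dbo: SQL Server first looks for
-- a procedure owned by that user, misses, then falls back to dbo.
EXEC usp_GetOrders

-- The sp_ prefix sends SQL Server to the master database first,
-- so avoid it for procedures that live in user databases.
EXEC sp_GetOrders
```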

Excerpt 2: A Real World Example

In The Real World

A real-world example I can cite occurred, as always, toward the end of a business day when we were all looking to get home. An application group was working late on changes to a system written within the organization. A development server hosting SQL Server suddenly became very slow: queries normally running in less than a second were taking a minute or more. Immediately the DBA team brought up System Monitor and looked at the processor and memory counters as well as the SQL Server counters for locks and user connections. The number of user connections was average for the development server (about 15-17), but lock waits were higher than normal. The memory counters showed no problems, yet processor utilization was near 100%.

Since the development server hosted a couple of other third-party applications that provided additional functionality to the application, we wondered if one of them was the culprit. Keep in mind that during this time the application team was screaming for the heads of the DBAs for allowing SQL Server to get in such bad shape. We took a look at the cache hit ratio and the buffer cache hit ratio. Both were normal.

We tried to get to SQL Server using Query Analyzer, and making the connection took considerably longer than normal. Executing the sp_who2 system stored procedure to see what activity was going on in SQL Server, we saw there was very little… though it took nearly two minutes for the procedure to return this information. At this point we had enough evidence to say it most likely was not SQL Server, but we needed one more counter to confirm what we believed. A counter object we haven't talked about is Process, which has an instance for every process running on the system. We added the sqlservr instance (the name of the SQL Server process) for % Processor Time and saw it was low (less than 10%). This vindicated the DBA team.
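For reference, the activity check itself is a one-liner, and the optional 'active' argument, which filters out idle sessions, is worth knowing:

```sql
-- List all sessions: SPID, status, blocking SPID (BlkBy), CPU, and I/O.
EXEC sp_who2

-- Restrict the output to sessions actually doing work.
EXEC sp_who2 'active'
```

A useful side effect, as in this story: if sp_who2 itself is slow to return while showing little activity, the bottleneck is likely outside SQL Server.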

We were off the hook, but the problem still existed, so we continued to investigate by checking out all the processes. Sure enough, one of the third-party applications was continually registering > 95 %. We used a tool to kill the process remotely and watched the processor utilization for the entire system drop immediately. Suddenly the queries started resolving in the expected times. Problem solved! After a quick write-up, we headed home.

Excerpt 3: Using Profiler to Find Deadlocks and Lock Timeouts


Acquiring and releasing locks is a typical part of any multi-user database system, and SQL Server is no exception. SQL Server allows us to track locks being acquired and released as events, but in a SQL Server installation with any kind of concurrent usage, these events occur frequently enough to be of relatively little value. There are other events, however, that are of great use:

  • Lock:Deadlock
  • Lock:Deadlock Chain
  • Lock:Timeout

The first two, Lock:Deadlock and Lock:Deadlock Chain, are very useful if we're trying to track down which statements are causing deadlocking issues. While it is well understood that database tables and views should always be accessed in the same order, in larger environments this rule can be violated quite easily when multiple developers are working to create a usable system. By turning on these events, we can report these conflicts and get the information back to our development teams.

To create a deadlock event on your own, open two connections in Query Analyzer to a server. For our example, we’ll use the Northwind database. In one connection, we’ll enter the following code:

BEGIN TRAN

UPDATE Customers
SET CompanyName = 'Doe, Inc'

-- SELECT * FROM Employees

In the other connection, we’ll place the following code:

BEGIN TRAN

UPDATE Employees
SET LastName = 'Doe'

-- SELECT * FROM Customers

The idea here is that the first connection will create an exclusive table lock on the Customers table, and the second connection will create an exclusive table lock on the Employees table. Then the first connection will try to select from the Employees table. Since the second connection holds an exclusive table lock and its transaction isn't committed, the first connection will be blocked. Then we'll have the second connection try to select from the Customers table. It too will be blocked, and SQL Server will realize we've run into a deadlock situation. Let's walk through it step by step. First we highlight and execute just BEGIN TRAN for both connections, as in Figure 54.

Figure 54: Executing BEGIN TRAN on Both Connections

Next we’ll execute both UPDATE statements to create the exclusive table locks (Figure 55).

Figure 55: Update Statements Forcing Table Locks

The final step is to execute the SELECT statements. This forces the deadlock: one of the connections will have its operation terminated and its transaction automatically rolled back (Figure 56). The other connection, since the blocking table lock is removed, will be able to complete its SELECT statement. As a matter of cleanup, the ROLLBACK TRAN statement will need to be executed on the connection whose SELECT statement completed (the first connection in Figure 56).

Figure 56: SELECT Statements forcing the Deadlock

We’ll also see the Lock:Deadlock and Lock:Deadlock Chain events occur in our Profiler trace in Figure 57. In this particular example, I’ve left the Exception event in the events to track. As a result, we see it showing up as well.

Figure 57: Deadlocking Recorded in Profiler

Notice the Lock:Deadlock event corresponds to the connection (SPID) chosen as the deadlock victim. In this example the SPID is 51, and we can scroll back through our Profiler trace to see the T-SQL statements leading up to the deadlock, along with the statements executing around the same time. In this particular case, SPID 51 was trying to do a SELECT * FROM Customers when the deadlock occurred. We would take a closer look at SPID 51's commands and compare the two connections.

The last event, Lock:Timeout, isn't as useful because by default SQL Server doesn't register a lock timeout unless the client application sets one. A program can specify a lock timeout with the SET LOCK_TIMEOUT statement; otherwise, SQL Server and the application will wait indefinitely for the locks to clear. If we do set a lock timeout and the threshold is reached, we'll see the event in the Profiler trace shown in Figure 58.
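A minimal sketch of setting the timeout from the client side:

```sql
-- Wait at most 5 seconds (5000 ms) for any lock before giving up.
-- The default of -1 waits indefinitely; 0 fails immediately.
SET LOCK_TIMEOUT 5000

-- When a request times out, the statement fails with error 1222
-- ("Lock request time out period exceeded") and Profiler records
-- the Lock:Timeout event.
SELECT @@LOCK_TIMEOUT   -- current setting in milliseconds
```

The setting applies per connection, so each application connection that wants timeouts must issue it.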

Figure 58: Lock Timeout

Concluding Remarks

Hopefully these three excerpts have whetted your appetite for the Start to Finish Guide to SQL Server Performance Monitoring. There's a lot more in the book, including how to capture the data, how to transform it into a readable format, and how to use the data to spot potential issues ahead of time. I also cover a step-by-step methodology on performance monitoring, my attempt to put into a "science" what some consider an "art." The last thing I'll leave you with is a condensed Table of Contents:

Condensed Table of Contents:

  • Performance Monitoring Tools
  • A Straight Forward Methodology
  • Performance Logs and System Monitor
    • Monitoring Locally vs. Remotely
    • Real-Time versus "Unattended" Monitoring
    • Setting Up a Performance Log
    • Choosing the Right Counters
    • Spot Checking in an Emergency
    • Baselines for the Long Haul
    • Responding to Trends
  • SQL Profiler
    • Creating a Profiler Trace
    • Typical Events
    • Stored Procedures, Caching, and Recompiles
    • Converting Trace Files
  • Putting it All Together