SQLServerCentral Article

An Informal Look at Database Performance

,

One of the most common topics that comes up on sites like http://ask.sqlservercentral.com/ is how to improve the performance of an application or procedure that is running slowly. Sometimes the questioner posts their exact code and lots of details and this permits the community to provide an in depth analysis from multiple perspective. More often it is phrased very generally. After typing up a variation of the same advice I have given many times, I decided to compile the most common advice I give in one convenient place.

This is certainly not meant to be an exhaustive treatise on optimization; numerous large books have been written about that. Rather it is an informal, high level overview on the most common things to look at when dealing with general database performance issues. These are in no particular order.

Look at the T-SQL code

I always suggest looking at the code for two obvious reasons. One is that it is the area a developer will have the most direct control over. Most people asking a question about performance have control over the code (or at least the significant subsection thereof) but many will not have authority to change hardware, change schemas, or even change indexes without getting permission from someone else. The second reason is that the code really is often the main source of the problem.

This of course immediately brings up the question of what to look for and how to optimize it. And again, entire books have been written on optimization. But there are a couple of significant places to start.

RBAR and Procedural Processing

RBAR stands for "Row By Agonizing Row" and is a term coined by Jeff Moden for code that processes every row instead of dealing with it as a set. SQL Server was designed to work with sets and SQL Server works most efficiently when it can deal with sets. Obvious indicators of RBAR coding in SQL include cursors and while loops.

Loops and cursors should be avoided whenever possible. In SQL, if it can possibly be done in a set based fashion, it should be done in a set based fashion. The few loops I have not managed to move to set based code involve iterating over non-row objects such as files in the file system or indexes within the database.

While cursors and loops are obvious signs of RBAR and procedural code, it is possible to have code that looks like it is set based and still hides RBAR. Jeff Moden has articles about this at: Hidden RBAR: Triangular Joins (http://www.sqlservercentral.com/Authors/Articles/Jeff_Moden/80567/) and More RBAR and "Tuning" UPDATEs (http://www.sqlservercentral.com/articles/Performance+Tuning/62278/).

Code Doing Unnecessary Work

Of course, look for code doing unnecessary work. This may involve returning columns that are not used. It may involve calculating values that are not used. In some cases, especially with code that is older and has been modified many times, I have seen large blocks of code that are executed but then their results are simply discarded.

Examine the execution plan

Execution plans for queries or entire scripts can provide very good ideas of where most of the work is being done and where most of the optimization effort should focus, though they will certainly not be perfect especially when dynamic SQL is used. In particular, table scans are almost always a sign that something can be improved (in certain rare circumstances, mostly involving very small tables, a table scan actually can be more efficient than an index and the optimizer will correctly choose it), which brings us to:

Look at the Indexes

Indexes make read operations vastly faster at the expense of write operations. Since most databases do vastly more read operations than write operations, this is normally a good trade off. If any code in production is using table scans in particular, creating appropriate indexes can tremendously improve the performance.

There are a number of automated tools both from third parties and built into SQL Server to help a developer determine what indexes would be useful. One tool in particular worth looking at is the Missing Index Feature. In SQL Server 2008, if you show the execution plan for a simple query, it will automatically display the Missing Indexes. In SQL Server 2005 you can get this information from the dynamic management views, as detailed by Ranga Narasimhan in Missing Indexes in SQL Server 2005 (http://www.sqlservercentral.com/articles/Indexing/64134/) or by MSDN at (http://msdn.microsoft.com/en-us/library/ms345524.aspx). Of course, this is not perfect, as Rob Farley pointed out (http://msmvps.com/blogs/robfarley/archive/2008/10/12/missing-index-in-sql-server-2008-should-try-harder.aspx), but it is a very good place to start.

For truly complicated queries, you can consider an indexed view. Indexed views should be used carefully. Like all indexes, creating an indexed view will impact write performance, and there are some restrictions on what views can be indexed. Moreover, if you are using a version of SQL Server other than Enterprise you will need to explicitly instruct the optimizer to use the indexed view using the NoExpand hint. I previously wrote about this topic in On Indexes and Views (http://www.sqlservercentral.com/articles/Indexed+Views/63963/). With that said, under the right circumstances there are huge performance benefits available from proper uses of indexed views.

Even with good indexes in place, they will perform poorly if the indexes are highly fragmented. Exactly what constitutes highly fragmented is open to some debate and is somewhat dependant on the individual situation. But a very general rule of thumb is that if it is more than 30% fragmented then it should be rectified. It often makes sense to have basic defragmentation on some form of automated schedule to ensure that they never become highly fragmented. Robert Sheldon provides an excellent overview in Defragmenting Indexes in SQL Server 2005 and 2008 (http://www.simple-talk.com/sql/database-administration/defragmenting-indexes-in-sql-server-2005-and-2008/) and if you like PowerShell then Laerte Junior has an article about using PowerShell to handle a lot of the index maintenance at Exceptional PowerShell DBA Pt 3 - Collation and Fragmentation ( http://www.simple-talk.com/sql/database-administration/exceptional-powershell-dba-pt-3---collation-and-fragmentation/)

Look at the Statistics

Statistics are closely related to the indexes, and they are used by the query optimizer to help determine the execution plan. With default settings, SQL Server does a reasonably good job of maintaining the statistics it needs automatically, but they can become out of date for a variety of reasons even with the auto update enabled. If they are out of date, then they can be forced to update with the UPDATE STATISTICS

command. Depending on the details of your situation, it can make sense to have a job scheduled to run update statistics. There are a variety of opinions all from very knowledgeable experts as to if and how often update statistics should be explicitly called, but as a very rough rule of thumb there is benefit in reorganizing statistics if queries against that table that were fast have become slow or the data has change substantially (very large insert/update/delete with very large being defined relative to the size of the database). Also, as a very rough rule of thumb, when in doubt schedule an explicit update once a week.

Mike Gunderloy has an excellent primer on statistics at Basics of Statistics in SQL Server 2005 (http://www.developer.com/db/article.php/10920_3622881_2/Basics-of-Statistics-in-SQL-Server-2005.htm ) and Brad McGehee provides much more advanced hints and advice in his SQL Server UPDATE STATISTICS (http://www.sql-server-performance.com/tips/update_statistics_p1.aspx).

Look at Locking

If the script or procedure in question is blocked by a lock, it will come to a complete stop and wait patiently for the lock to be released. If this is happening frequently or for long periods of time, it can have a major impact on performance. Fortunately, there are a few ways of dealing with it.

The most direct and generally best way is to try to prevent it from happening in the first place. Try to minimize lock escalation, and make other code as efficient as possible so it holds locks for the least amount of time possible. When dealing with very long processes, especially update processes, you may want to break them into chunks so no one process monopolizes access to the tables. Article ID: 323630 (http://support.microsoft.com/kb/323630) in the Microsoft Knowledge Base provides some suggestions on minimizing blocking. For a general overview of the different types of locks and how they can affect the server, look at Introduction to Locking in SQL Server (http://www.sqlteam.com/article/introduction-to-locking-in-sql-server) by Mladen Prajdic.

The next method of dealing with locks is to modify the transaction isolation level. Changing the transaction isolation level should be done with care and forethought. The default isolation level is Read Committed and this is the right level the vast majority of the time. It should be moved from this only if there is a good reason to. But in cases where blocking is an extreme problem and cannot be practically resolved in other ways, then you can look at using snapshot isolation or using the Read Uncommitted isolation level which can be specified for a particular query using the NoLock query hint.

In a simplified form, snapshot isolation temporarily stores copies of data being changed in TempDB and permits new queries to work with that copy until the lock is released. This means that locks become minor issues, but it creates a high degree of overhead to maintain those copies and can require considerable storage in TempDB. Using NoLock query hints or otherwise invoking a Read Uncommitted isolation level risks dirty reads, which means you could return inconsistent data. NoLock also produces a chance of reading a row multiple times or skipping rows. Both Snapshot isolation and Read Uncommitted isolations should be used only with great care. But, if blocking is a major issue, the blocking problems cannot be resolved in other ways, then it may be worth the overhead for Snapshots or the risk of a dirty read for Read Uncommitted under the right circumstances. Ron Talmage provided a good look at the costs and benefits of Snapshots in his article Snapshot Isolation Adds Flexibility in SQL Server 2005 (http://www.devx.com/codemag/Article/21570/0/page/1), and Tim Chapman provides a brief look at the power and dangers of NoLock at Using NOLOCK and READPAST table hints in SQL Server (http://articles.techrepublic.com.com/5100-10878_11-6185492.html) .

Look at the Hardware

Sometimes the best way to fix a performance problem is to throw hardware at it. If the problem resides in poorly written code, blocking locks, or improper indexing then faster hardware will only mask the problem temporarily at best. But, if the demands on the database have grown vastly from when it was created, or if the hardware was underpowered to start with then it makes sense to upgrade it.

As a general rule, the first place to upgrade a SQL Server is the RAM. A busy SQL Server can consume tremendous amounts of RAM with the caching it does and the datasets it may be called to deal with. Naturally, the processing power, storage system, and network connection of the server all play a role in the responsiveness of SQL Server and if any of them are being taxed to the limit then upgrading that component will help speed up the system. In a busy server with numerous databases, it may also be possible to move some databases off to another server to reduce the overall load on the machine.

Maintain a performance baseline

Over the long term, it helps to maintain a performance baseline. This is valuable in seeing if there is a performance decline and if maintained in enough detail can help determine where the problem lies. It can also help determine whether or not the problem really is with the SQL Server instance, which brings us to:

Make sure the problem is with SQL Server

I have seen and heard of many cases where the problem that was being blamed on SQL was not in the SQL Server at all. The application layer also needs to be efficient and working on hardware that is sufficient for its need. This is particularly a concern when it is a web or browser based application that is relying on a web server as well.

Clearly, if SQL Server is not the bottleneck, then all the optimization possible and the fastest hardware available for the SQL Server would not help the final issue.

Get another set of eyes

I was led to write this by a series of very general performance questions on ask.sqlservercentral.com and there is nothing wrong with asking for another set of eyes, either within your team or (after ensuring you are not revealing proprietary or sensitive information) to the larger community of SQL Server users. If you do post on ask.sqlservercentral.com you can expect expert advice from brilliant people like Grant Fritchey, Matt Whitfield, Kev Riley, and many others in the community.

But, you will generally get better answers if you can provide specific, detailed questions than asking about performance in general. Also, Jeff Moden provided some very useful advice in Forum Etiquette: How to post data/code on a forum to get the best help (http://www.sqlservercentral.com/articles/Best+Practices/61537/).

Good Articles on Performance

Finding the Causes of Poor Performance in SQL Server, Part 1 by Gail Shaw (http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-1/)

Finding the causes of poor performance in SQL Server, Part 2 by Gail Shaw (http://www.simple-talk.com/sql/performance/finding-the-causes-of-poor-performance-in-sql-server,-part-2/)

Performance Tuning by Brent Ozar (http://www.brentozar.com/sql/sql-server-performance-tuning/)

Troubleshooting Performance Problems in SQL Server 2005 in MSDN (http://www.brentozar.com/sql/sql-server-performance-tuning/)

Rate

4.53 (51)

You rated this post out of 5. Change rating

Share

Share

Rate

4.53 (51)

You rated this post out of 5. Change rating