SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Common SQL Server Mistakes – Indexing Every Column

If one index helps speed up queries, than more indexes should help more, right? They do, but they also come at a price. Both in performance during data modifications (insert/update/delete), and in terms of space since each index must be stored somewhere.

I have never bothered to index every column of a table. Actually I’m not sure if I’ve indexed every column of any table. Perhaps that’s because I originally came from a dBase/Clipper/Foxpro environment and I had to manage every index manually. Adding too many indexes resulted in a lot of coding.

In SQL Server, each index is useful in two ways. When someone puts a filter in the WHERE clause, like this:

  CustomerID, TerritoryID, AccountNumber
from Sales.Customer
where CustomerID = 10

then an index on CustomerID will speed up this query. Instead of having to scan all rows of the table, the index on CustomerID can be searched, the correct row found. If this is a clustered index, then the data can be read. If it’s a non-clustered index, the server can retrieve the rowID and then go get the data from the clustered index without reading all the rows in the table.

In more recent versions of SQL Server, multiple indexes can be used. For example,

  CustomerID, TerritoryID, AccountNumber
from Sales.Customer
where CustomerID = 10
  and TerritoryID = 23

In AdventureWorks, there are indexes on both CustomerID and TerritoryID for this table. It is possible (with lots of data), that the optimizer might choose to scan the CustomerID index for all matching rows and then the TerritoryID index for matching rows, join those results together to get a set of rows for the overall query and read the clustered index for those specific rows. However that’s not something you can count on in SQL Server. Typically one index is used in many queries.

So why not index every column?

First, if the columns are large, like varchar(max), text, or varbinary(max) columns, then it doesn’t make sense to build large indexes unless you often query these fields. Even then, a full-text index is likely a better choice.

Second, each time you change data (insert/update/delete), then all indexes must be updated at the same time. This means that your write performance suffers, and that impacts the server read performance as well since resources are being used to perform those updates. The more indexes, the more work that has to be done in support of any DML statement.

Lastly, you typically find that most of the time you query a table based on 3-5 fields, and those are the best candidates for indexes. For transactional tables, this is typically the number of indexes that you want to put on each table. Reporting tables, or OLAP type tables, might have more indexes, but these are tables that typically receive mostly read activity, and rare write activity.

Which columns do you index? Pick those columns that often appear in your queries, and that are fairly selective. You can always query the missing index DMVs for help in choosing which indexes the optimizer things it might use.

The Voice of the DBA

Steve Jones is the editor of SQLServerCentral.com and visits a wide variety of data related topics in his daily editorial. Steve has spent years working as a DBA and general purpose Windows administrator, primarily working with SQL Server since it was ported from Sybase in 1990. You can follow Steve on Twitter at twitter.com/way0utwest


Posted by Chris Harshman on 29 September 2010

This reminds me of when I used to work with Siebel CRM, I couldn't believe how many indexes they created, and that's an OLTP system.

I've frequently avoided the "missing index" suggestions though, it can sometimes suggest excessive indexes or indexes close to an existing one.  I've seen it also suggest creating indexes that I believe would have had poor selectivity on them.

Posted by Steve Jones on 29 September 2010

You definitely have to examine the "missing indexes" with a grain of salt. It suggests any that it could use, which can lead to too many indexes.

Leave a Comment

Please register or log in to leave a comment.