Blog Post

T-SQL Tuesday #10 - Indexes


TSQL2sDay150x150[1] It’s time for T-SQL Tuesday again, the brainchild of Adam Machanic (Blog|Twitter), and this month’s topic is Indexes. A thanks to Michael Swart for hosting this month’s event, and sending me a personal invite through email. That was pretty cool, especially since I don’t necessarily remember when the event is kicking off.

The invitation is here, and time is running out to participate, but if you want to blog, you can do it today before this script (from Michael) tells you that you can’t

IF GETUTCDATE() BETWEEN '20100914' AND '20100915'
SELECT 'You Can Post'
'Not Time To Post'

Remembering to Index

Indexing seems to be one of those things that so many people forget to do. As a result, adding a few indexes is often the easiest win, the lowest hanging fruit, the most effective way of improving performance in many situations.

I know in the past that I almost always create a single index on a new table, the PK, when I build it. If there are FKs, those are setup as well, and this usually helps, but when building a database, I sometimes forget to add other indexes since I don’t know what other queries I might be running when I start. And as I start to build those queries (to support new functions for the user), it’s easy to forget to add an index.

sqldatagenerator[1] Since it’s hard to guess what you will query often, and the think/build/test/refactor model leads you down dead ends at times, I don’t typically build an index for a new query. I don’t notice performance issues since I’m usually working with test data that has dozens of rows, not hundreds. In the past, I’ve wished for something like Data Generator. Now that I have it, I typically don’t need it anymore.

I try to go back and add in indexes as we get close to deployment, but I’ll admit that I typically have found in the production system that I’m missing indexes. This usually occurs over time, as data grows and performance decreases. I’ve wished that I had a better method for handling indexes, and apparently Microsoft was listening to wishes since we now have sys.dm_db_missing_index_details and other missing index DMVs.

However I found someone that did have a solution. The company I worked for bought a third party piece of software to handle some specialized function for a department. I wasn’t involved, but one day they came to me with a performance issue. As I dug into the application, which had been working for months, I noticed something interesting.

Every column was indexed.

Not only was every column indexed, they had added other indexes that reversed columns in the index, so if I had a table like this one:

CustomerID INT
, FirstName VARCHAR(50)
LastName VARCHAR(50)
City VARCHAR(50)
, Country VARCHAR(50)
, Notes varchar(max)

I might have these indexes:

  • CI : CustomerID
  • NCI: FirstName
  • NCI: LastName
  • NCI: FirstName, Lastname
  • NCI: LastName, Firstname
  • NCI: Address1
  • NCI: City
  • NCI: StateID
  • NCI: Country
  • NCI: Notes
  • NCI: Address1, city, stateid, country

It seemed like a little overkill, and after discussing this with the support engineer for awhile, he admitted that a number of these indexes weren’t needed, but the developer didn’t know what to index, so they indexed everything.

We decided to run a trace on the application for a week, sort through queries, and come up with some more rational choices for indexes. We also removed a few of the duplicate indexes, and we improved performance for a number of functions.

Indexing is important, but you can overdo it. Make sure that you index, but also think about what you index, and don’t index everything.