RE: Worst Practices - Not Using Primary Keys and Clustered Indexes

SSC Enthusiast

Points: 181

August 1, 2002 at 9:35 am

More about clustered keys.

Andy is right about a lot of things in this article. Ironically, a few of his mistakes are not the result of being overzealous, but of giving the opposing argument too much slack!

For example, Andy says that some database designers will argue that a table doesn't need a primary key because "it's a temp table, so the rules don't really apply." He writes, "I think there is some validity to this position. [...] the most compelling reason [...] is that in many places where you use a temp table now you could (and probably should) use a variable of type table (available beginning in SQL 2000) which only allows you to define the structure, not indexes."

But you *can* specify a primary key (with a clustered index, if you like) on a table variable:

DECLARE @temp TABLE (

a INTEGER,

b NVARCHAR(2000),

PRIMARY KEY CLUSTERED(a)

)

Another argument Andy should dismiss, but doesn't, is that a particular type of table doesn't need a clustered index. He writes: "A good example of this might be a lookup table that contains state name and state abbreviation. You're going to be doing single record lookups, never a range query which is where a clustered index excels. You're also not going to have a lot of indexes, probably one on the name and one on the abbreviation where the index will provide the data, no need to do a seek through the table to find other bits of information associated with that row. For this example, is it a fair point? I think so."

Well, I don't think it matters because SQL Server will probably just keep the small table in memory. But let's look more closely at this.

Suppose you don't use a clustered index, and you then query against your "states" table for a particular state code. SQL Server has two options: (1) use the index, in which case it has to load two pages of data from the disk - a data page and an index page; or (2) don't use the index, load one page of data, but then scan the whole table to find the right row. It's a trade-off.

When you use a clustered index, the index information is built into the table data itself. So there's no trade-off. SQL Server only has to load one page from disk, and that page contains both the table data and the index data. A table with a clustered key could therefore actually perform faster.

[Plug: I verified some of this using Lumigent's Log Explorer tool. I created one table with a clustered index and one with a non-clustered index. When I insert rows into the clustered table, SQL Server records a single write in the transaction log. When I insert a row into the other table, SQL Server records two writes on different pages: one for the table itself, and one for the non-clustered index. Log Explorer let me view these records in the log. Note - I work for Lumigent, which advertises on this site.]

As a side note, the Books Online specifically recommend clustered indices for "[c]olumns that contain a limited number of unique values, such as a state column that contains only 50 unique state codes." But I can't tell if they are talking about a small "states" table or a large table that happens to have a "state" column.