Snargables (8/22/2014)
Recently I got into a discussion w/ a coworker over the primary key of a fact table. He wanted to put the primary key on the identity column, and I suggested putting it on the columns that make a row unique; in this instance there are two, and both are ints. Here’s his logic.
Put the clustered index on the surrogate key (the PK) and then add non-clustered indexes to the table. A smaller clustered primary key leads to smaller non-clustered indexes and faster performance.
To me this doesn’t seem right. Why would you cluster on an id unless you were going to use it in your queries?
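To make the two positions concrete, here is a rough T-SQL sketch of both designs. The table and column names (FactSales_Surrogate, FactSales_Natural, DateKey, ProductKey, Amount) are made up for illustration and are not from the actual table under discussion.

```sql
-- Option A (the coworker's suggestion): cluster on the identity column
-- and enforce uniqueness of the two int columns separately.
-- All object names here are hypothetical.
CREATE TABLE dbo.FactSales_Surrogate
(
    FactSalesId INT IDENTITY(1,1) NOT NULL,
    DateKey     INT NOT NULL,
    ProductKey  INT NOT NULL,
    Amount      DECIMAL(18,2) NOT NULL,
    CONSTRAINT PK_FactSales_Surrogate
        PRIMARY KEY CLUSTERED (FactSalesId),
    CONSTRAINT UQ_FactSales_Surrogate
        UNIQUE NONCLUSTERED (DateKey, ProductKey)
);

-- Option B (the original poster's suggestion): cluster directly on the
-- two int columns that make a row unique.
CREATE TABLE dbo.FactSales_Natural
(
    DateKey     INT NOT NULL,
    ProductKey  INT NOT NULL,
    Amount      DECIMAL(18,2) NOT NULL,
    CONSTRAINT PK_FactSales_Natural
        PRIMARY KEY CLUSTERED (DateKey, ProductKey)
);
```

In SQL Server, every non-clustered index row carries the clustering key as its row locator, so Option A adds 4 bytes per non-clustered index row and Option B adds 8. Whether that difference outweighs clustering on the columns the queries actually filter and join on depends on the workload, so treat this as a starting point for testing rather than an answer.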
Expanding on all the good points Sean has made, consider this: what if you needed six or even ten columns of your table in order to uniquely identify a row? And to make it even more awkward, one or more of those columns are nullable 😛
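A small sketch of the nullable case, assuming a hypothetical FactShipment table (every name below is invented for illustration). SQL Server does not allow nullable columns in a PRIMARY KEY constraint, so one common workaround is a surrogate key for the PK plus a unique index over the natural columns:

```sql
-- Hypothetical table: the natural key includes a nullable column,
-- so it cannot be declared as the PRIMARY KEY.
CREATE TABLE dbo.FactShipment
(
    FactShipmentId INT IDENTITY(1,1) NOT NULL,
    OrderKey       INT NOT NULL,
    WarehouseKey   INT NOT NULL,
    CarrierKey     INT NULL,          -- nullable part of the natural key
    Quantity       INT NOT NULL,
    CONSTRAINT PK_FactShipment
        PRIMARY KEY CLUSTERED (FactShipmentId)
);

-- A unique index, unlike a primary key, accepts nullable columns.
-- SQL Server treats NULLs as equal here, so only one row per
-- (OrderKey, WarehouseKey) may have a NULL CarrierKey.
CREATE UNIQUE NONCLUSTERED INDEX UX_FactShipment_Natural
    ON dbo.FactShipment (OrderKey, WarehouseKey, CarrierKey);
```

With six or ten columns in the natural key, the same pattern keeps the clustering key narrow while the unique index still guards against duplicate rows.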