Great article. As someone who started out using packed data on old IBM systems to save room, even I can get lazy about selecting the right column type.
Can someone clarify one sentence in the article:
So you need to keep in mind things like (for SQL Server at least) NonClustered indexes typically contain the data from the Clustered index (exceptions are: Unique Indexes and NonClustered Indexes that contain the Cluster Key)
I just reviewed the structure of nonclustered indexes (http://msdn.microsoft.com/en-us/library/ms177484.aspx), and from what I see, a nonclustered index does not store the clustered index data. Instead of a RID (used when the table is a heap), each nonclustered index row stores the clustered index key as its row locator.
From the same article as above:
If the table has a clustered index, or the index is on an indexed view, the row locator is the clustered index key for the row. If the clustered index is not a unique index, SQL Server makes any duplicate keys unique by adding an internally generated value called a uniqueifier. This four-byte value is not visible to users. It is only added when required to make the clustered key unique for use in nonclustered indexes. SQL Server retrieves the data row by searching the clustered index using the clustered index key stored in the leaf row of the nonclustered index.
Can someone show me how a nonclustered index's size (disk cost) is affected by the clustered index key size?
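My own rough guess at the arithmetic, going only by the passage quoted above, is sketched below in Python. The function names and the example row count are made up for illustration. The sketch assumes each nonclustered leaf row costs roughly its own key bytes plus the row locator: the clustered key (plus the 4-byte uniqueifier in the worst case where every row needs one) or an 8-byte RID on a heap. It ignores row headers, null bitmaps, page overhead, fill factor, and the non-leaf levels, so treat the numbers as relative comparisons, not real sizes.

```python
RID_BYTES = 8          # heap row locator (file, page, slot)
UNIQUEIFIER_BYTES = 4  # added only when a non-unique clustered key needs it


def locator_bytes(clustered_key_bytes=None, clustered_is_unique=True):
    """Approximate bytes of row locator carried in each nonclustered leaf row.

    clustered_key_bytes=None means the table is a heap, so the locator is a RID.
    For a non-unique clustered index this assumes the worst case, where every
    row carries a uniqueifier.
    """
    if clustered_key_bytes is None:
        return RID_BYTES
    extra = 0 if clustered_is_unique else UNIQUEIFIER_BYTES
    return clustered_key_bytes + extra


def approx_leaf_bytes(rows, nonclustered_key_bytes,
                      clustered_key_bytes=None, clustered_is_unique=True):
    """Very rough leaf-level size: rows * (index key + row locator)."""
    per_row = nonclustered_key_bytes + locator_bytes(
        clustered_key_bytes, clustered_is_unique)
    return rows * per_row


if __name__ == "__main__":
    rows = 10_000_000  # hypothetical table size
    nc_key = 4         # nonclustered index on a single 4-byte int column
    cases = [
        ("heap (RID locator)", None, True),
        ("4-byte int clustered key, unique", 4, True),
        ("4-byte int clustered key, non-unique", 4, False),
        ("16-byte uniqueidentifier clustered key", 16, True),
        ("50-byte char clustered key", 50, True),
    ]
    for label, ck, unique in cases:
        size = approx_leaf_bytes(rows, nc_key, ck, unique)
        print(f"{label}: ~{size / 1_000_000:.0f} MB of key + locator data")
```

If that reasoning holds, every extra byte in the clustered key is paid again in each row of every nonclustered index on the table, which would explain why a wide clustered key is expensive. Corrections welcome if I have misread the documentation.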