SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Introduction to Indexes: Part 2 – The clustered index


Introduction to Indexes: Part 2 – The clustered index

Author
Message
Esalter
Esalter
SSC Rookie
SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)

Group: General Forum Members
Points: 37 Visits: 18
Hi,

Points taken - "hard to believe" (whole clustered index value in non clustered index) - I'm surprised that non clustered indexes don't also use the RID to provide a more efficient use of space.

I know space isn't so much of a problem nowadays however disk reads/writes are still (generally) the most time consuming function that a PC etc performs, so reducing the physical data storage size can overall, increase the data density and reduce the amount of I/O operations required, particularly head movement.

This is of course countered by the need to update all non clustered indexes when a row is moved to a different page as the result of a split operation but this (as you stated in your article) should be minimised by the clustered key design (i.e. stable, non changing).

"Article suggest a full able scan required".. my sincere apologies, I have re-read the para and you didn't make that suggestion - I mis-read the content of the SCAN method of accessing a table....

Sorry if I misled anybody or gave the impression that I thought the article was inaccurate, this wasn't my intent.

I will be reading part 3 when I get the opportunity....

Jim

Trainee Novicew00t
Brian.Eriksen
Brian.Eriksen
Forum Newbie
Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)Forum Newbie (3 reputation)

Group: General Forum Members
Points: 3 Visits: 21
Very nice overview. Gail, one thing that I was wondering if you could comment on... Towards the end there was a discussion of the best type of clustering key to use. I totally understand you're providing some very good general rules, but one of the attributes was that it is unique. Often times a very important use of a clustered index is to organize data in the detail table of a summary-detail table pair, for example, orders and order items. The most common query against the detail table will be to find those rows that correspond to a given summary row (ethier when looking up a specific order, or aggregating detail rows for all or a group of summary rows). In this case it is terribly important to use the one and only clustered index in the detail table to be on the foreign key linking back to the summary table, I think? I'm pretty sure that's generally the right way to do things so that range queries on the detail table (again, assuming they're the most common queries on the detail table).

Anyways, I just thought that since summary-detail table relationships are very common that it's probably worth calling out this exception to the uniqueness guideline specifically. What do you think?
GilaMonster
GilaMonster
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86288 Visits: 45232
TheSQLGuru (11/11/2009)
I am often-times disappointed that I cannot have more than one clustered index. Do you think they will ever put this most useful feature in SQL Server Gail?? Hehe


:-D Were you in my session on index DMVs?

CREATE INDEX ON SomeTable (KeyColumn) INCLUDE (<All other columns in table)

Voilà, a second 'clustered' index.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


GilaMonster
GilaMonster
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86288 Visits: 45232
Esalter (11/11/2009)
I'm surprised that non clustered indexes don't also use the RID to provide a more efficient use of space.


A RID is 8 bytes. An integer column defined as identity (a very good and popular choice for clustered index) is 4 bytes.

This is of course countered by the need to update all non clustered indexes when a row is moved to a different page as the result of a split operation


And the need to rebuild every single nonclustered index when the cluster is rebuilt. Painful in terms of logging.

The need to update all the nonclustered indexes is, I believe, one of the reasons the RID is not used as the row 'pointer' when there's a clustered index. Even when the underlying table is a heap, when a row is moved the RID in the NC indexes is not updated, rather a forwarding pointer is left behind which says where the row has moved to.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Esalter
Esalter
SSC Rookie
SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)

Group: General Forum Members
Points: 37 Visits: 18
Hi again,
Sadly we're still on SQL2000 so I'm reading these articles for interest (just starting to move stuff to SQL2005) however, one thing I've frequently come across that would be extremely useful is the ability to define Foreign keys that are a compound key (concatenated columns). Do you know if this is available in current versions of SQL (or ever likely to be)?

Jim

Trainee Novicew00t
GilaMonster
GilaMonster
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86288 Visits: 45232
Brian.Eriksen (11/11/2009)
Anyways, I just thought that since summary-detail table relationships are very common that it's probably worth calling out this exception to the uniqueness guideline specifically. What do you think?


Those are guidelines, not rules. If you feel that ignoring them is for the better, then go ahead. Of those guidelines, the uniqueness one is the one I most often ignore.

As I stated, I hold to the school that says the cluster is used to organise the table and the nonclustered indexes to retrieve data. I've done the summary/detail design before, often I use nonclustered indexes on the foreign key column. Other times I've used the cluster.

...
...
...

It depends. :-D

If you've tested and that is the best way to organise things and the fragmentation (if the cluster is not ascending) and increased width of the cluster is not an issue, go right ahead. Just be aware of the tradeoffs

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Esalter
Esalter
SSC Rookie
SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)SSC Rookie (37 reputation)

Group: General Forum Members
Points: 37 Visits: 18
Hi Gail,
(I promise I'll give up soon) - Am I missing something (as some colleagues have already suggested).
If an identity column is used for the clustered index (CI) and NC indexes use the identity column (key) rather than a RID as the retrieval pointer then doesn't that mean that when the NC index is searched and the key (Identity column) is retrieved, it then needs to traverse the CI tree down to the leaf level to retrieve the actual row required as the 'identity column' alone wouldn't physically locate the row, it's just part of the data (leaf level of the CI)?
Jim

Trainee Novicew00t
GilaMonster
GilaMonster
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86288 Visits: 45232
Esalter (11/11/2009)
one thing I've frequently come across that would be extremely useful is the ability to define Foreign keys that are a compound key (concatenated columns). Do you know if this is available in current versions of SQL (or ever likely to be)?


Multi-column Foreign keys? That's been possible for many versions.
Create table Parent (
ParentID int identity primary key,
ChildID1 int,
ChildID2 int,
SomeArbString varchar(10)
)

Create table Child (
ChildID1 int not null,
ChildID2 int not null,
SomeDate datetime
)

ALTER TABLE Child ADD CONSTRAINT pk_Child PRIMARY KEY (ChildID1, ChildID2)

ALTER TABLE Parent ADD CONSTRAINT fk_Parent_Child FOREIGN KEY (ChildID1, ChildID2) REFERENCES Child (ChildID1, ChildID2)



Got nothing to do with indexes though

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


alen teplitsky
alen teplitsky
SSCrazy
SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)

Group: General Forum Members
Points: 2702 Visits: 4666
GilaMonster (11/11/2009)
Esalter (11/11/2009)
I'm surprised that non clustered indexes don't also use the RID to provide a more efficient use of space.


A RID is 8 bytes. An integer column defined as identity (a very good and popular choice for clustered index) is 4 bytes.

This is of course countered by the need to update all non clustered indexes when a row is moved to a different page as the result of a split operation


And the need to rebuild every single nonclustered index when the cluster is rebuilt. Painful in terms of logging.

The need to update all the nonclustered indexes is, I believe, one of the reasons the RID is not used as the row 'pointer' when there's a clustered index. Even when the underlying table is a heap, when a row is moved the RID in the NC indexes is not updated, rather a forwarding pointer is left behind which says where the row has moved to.


i don't think i ever found a straight answer to this question

if i have a table with a clustered index and say 5 NC indexes

i run alter index <clustered_index> on <some_table> rebuild

will this also rebuild all the non-clustered indexes as well? i've noticed that the clustered index rebuilds take a lot longer to complete
GilaMonster
GilaMonster
SSC Guru
SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)SSC Guru (86K reputation)

Group: General Forum Members
Points: 86288 Visits: 45232
Esalter (11/11/2009)

If an identity column is used for the clustered index (CI) and NC indexes use the identity column (key) rather than a RID as the retrieval pointer then doesn't that mean that when the NC index is searched and the key (Identity column) is retrieved, it then needs to traverse the CI tree down to the leaf level to retrieve the actual row required as the 'identity column' alone wouldn't physically locate the row, it's just part of the data (leaf level of the CI)?


The identity column would identify the row (it's the clustering key). To get the rest of the row, SQL would do a lookup to the cluster. That's called a bookmark lookup in SQL 2000 and a key lookup in SQL 2005 and above. The physical location of the row is totally irrelevance.

That's discussed in this article. The section titled 'lookups'

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search