Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12345»»»

Introduction to Indexes: Part 2 – The clustered index Expand / Collapse
Author
Message
Posted Wednesday, November 11, 2009 8:04 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, November 23, 2009 1:40 AM
Points: 13, Visits: 18
Hi,

Points taken - "hard to believe" (whole clustered index value in non clustered index) - I'm surprised that non clustered indexes don't also use the RID to provide a more efficient use of space.

I know space isn't so much of a problem nowadays however disk reads/writes are still (generally) the most time consuming function that a PC etc performs, so reducing the physical data storage size can overall, increase the data density and reduce the amount of I/O operations required, particularly head movement.

This is of course countered by the need to update all non clustered indexes when a row is moved to a different page as the result of a split operation but this (as you stated in your article) should be minimised by the clustered key design (i.e. stable, non changing).

"Article suggest a full able scan required".. my sincere apologies, I have re-read the para and you didn't make that suggestion - I mis-read the content of the SCAN method of accessing a table....

Sorry if I misled anybody or gave the impression that I thought the article was inaccurate, this wasn't my intent.

I will be reading part 3 when I get the opportunity....

Jim


Trainee Novice
Post #817182
Posted Wednesday, November 11, 2009 8:06 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, February 13, 2014 5:47 AM
Points: 1, Visits: 15
Very nice overview. Gail, one thing that I was wondering if you could comment on... Towards the end there was a discussion of the best type of clustering key to use. I totally understand you're providing some very good general rules, but one of the attributes was that it is unique. Often times a very important use of a clustered index is to organize data in the detail table of a summary-detail table pair, for example, orders and order items. The most common query against the detail table will be to find those rows that correspond to a given summary row (ethier when looking up a specific order, or aggregating detail rows for all or a group of summary rows). In this case it is terribly important to use the one and only clustered index in the detail table to be on the foreign key linking back to the summary table, I think? I'm pretty sure that's generally the right way to do things so that range queries on the detail table (again, assuming they're the most common queries on the detail table).

Anyways, I just thought that since summary-detail table relationships are very common that it's probably worth calling out this exception to the uniqueness guideline specifically. What do you think?
Post #817187
Posted Wednesday, November 11, 2009 8:17 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 2:01 PM
Points: 40,404, Visits: 36,849
TheSQLGuru (11/11/2009)
I am often-times disappointed that I cannot have more than one clustered index. Do you think they will ever put this most useful feature in SQL Server Gail??


Were you in my session on index DMVs?

CREATE INDEX ON SomeTable (KeyColumn) INCLUDE (<All other columns in table)

Voilà, a second 'clustered' index.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #817201
Posted Wednesday, November 11, 2009 8:24 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 2:01 PM
Points: 40,404, Visits: 36,849
Esalter (11/11/2009)
I'm surprised that non clustered indexes don't also use the RID to provide a more efficient use of space.


A RID is 8 bytes. An integer column defined as identity (a very good and popular choice for clustered index) is 4 bytes.

This is of course countered by the need to update all non clustered indexes when a row is moved to a different page as the result of a split operation


And the need to rebuild every single nonclustered index when the cluster is rebuilt. Painful in terms of logging.

The need to update all the nonclustered indexes is, I believe, one of the reasons the RID is not used as the row 'pointer' when there's a clustered index. Even when the underlying table is a heap, when a row is moved the RID in the NC indexes is not updated, rather a forwarding pointer is left behind which says where the row has moved to.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #817208
Posted Wednesday, November 11, 2009 8:25 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, November 23, 2009 1:40 AM
Points: 13, Visits: 18
Hi again,
Sadly we're still on SQL2000 so I'm reading these articles for interest (just starting to move stuff to SQL2005) however, one thing I've frequently come across that would be extremely useful is the ability to define Foreign keys that are a compound key (concatenated columns). Do you know if this is available in current versions of SQL (or ever likely to be)?

Jim


Trainee Novice
Post #817210
Posted Wednesday, November 11, 2009 8:32 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 2:01 PM
Points: 40,404, Visits: 36,849
Brian.Eriksen (11/11/2009)
Anyways, I just thought that since summary-detail table relationships are very common that it's probably worth calling out this exception to the uniqueness guideline specifically. What do you think?


Those are guidelines, not rules. If you feel that ignoring them is for the better, then go ahead. Of those guidelines, the uniqueness one is the one I most often ignore.

As I stated, I hold to the school that says the cluster is used to organise the table and the nonclustered indexes to retrieve data. I've done the summary/detail design before, often I use nonclustered indexes on the foreign key column. Other times I've used the cluster.

...
...
...

It depends.

If you've tested and that is the best way to organise things and the fragmentation (if the cluster is not ascending) and increased width of the cluster is not an issue, go right ahead. Just be aware of the tradeoffs



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #817219
Posted Wednesday, November 11, 2009 8:38 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, November 23, 2009 1:40 AM
Points: 13, Visits: 18
Hi Gail,
(I promise I'll give up soon) - Am I missing something (as some colleagues have already suggested).
If an identity column is used for the clustered index (CI) and NC indexes use the identity column (key) rather than a RID as the retrieval pointer then doesn't that mean that when the NC index is searched and the key (Identity column) is retrieved, it then needs to traverse the CI tree down to the leaf level to retrieve the actual row required as the 'identity column' alone wouldn't physically locate the row, it's just part of the data (leaf level of the CI)?
Jim


Trainee Novice
Post #817228
Posted Wednesday, November 11, 2009 8:43 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 2:01 PM
Points: 40,404, Visits: 36,849
Esalter (11/11/2009)
one thing I've frequently come across that would be extremely useful is the ability to define Foreign keys that are a compound key (concatenated columns). Do you know if this is available in current versions of SQL (or ever likely to be)?


Multi-column Foreign keys? That's been possible for many versions.
Create table Parent (
ParentID int identity primary key,
ChildID1 int,
ChildID2 int,
SomeArbString varchar(10)
)

Create table Child (
ChildID1 int not null,
ChildID2 int not null,
SomeDate datetime
)

ALTER TABLE Child ADD CONSTRAINT pk_Child PRIMARY KEY (ChildID1, ChildID2)

ALTER TABLE Parent ADD CONSTRAINT fk_Parent_Child FOREIGN KEY (ChildID1, ChildID2) REFERENCES Child (ChildID1, ChildID2)

Got nothing to do with indexes though



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #817231
Posted Wednesday, November 11, 2009 8:44 AM
Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Monday, November 10, 2014 9:36 AM
Points: 1,414, Visits: 4,544
GilaMonster (11/11/2009)
Esalter (11/11/2009)
I'm surprised that non clustered indexes don't also use the RID to provide a more efficient use of space.


A RID is 8 bytes. An integer column defined as identity (a very good and popular choice for clustered index) is 4 bytes.

This is of course countered by the need to update all non clustered indexes when a row is moved to a different page as the result of a split operation


And the need to rebuild every single nonclustered index when the cluster is rebuilt. Painful in terms of logging.

The need to update all the nonclustered indexes is, I believe, one of the reasons the RID is not used as the row 'pointer' when there's a clustered index. Even when the underlying table is a heap, when a row is moved the RID in the NC indexes is not updated, rather a forwarding pointer is left behind which says where the row has moved to.


i don't think i ever found a straight answer to this question

if i have a table with a clustered index and say 5 NC indexes

i run alter index <clustered_index> on <some_table> rebuild

will this also rebuild all the non-clustered indexes as well? i've noticed that the clustered index rebuilds take a lot longer to complete


https://plus.google.com/100125998302068852885/posts?hl=en
http://twitter.com/alent1234
x-box live gamertag: i am null
[url=http://live.xbox.com/en-US/MyXbox/Profile[/url]
Post #817232
Posted Wednesday, November 11, 2009 8:46 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 2:01 PM
Points: 40,404, Visits: 36,849
Esalter (11/11/2009)

If an identity column is used for the clustered index (CI) and NC indexes use the identity column (key) rather than a RID as the retrieval pointer then doesn't that mean that when the NC index is searched and the key (Identity column) is retrieved, it then needs to traverse the CI tree down to the leaf level to retrieve the actual row required as the 'identity column' alone wouldn't physically locate the row, it's just part of the data (leaf level of the CI)?


The identity column would identify the row (it's the clustering key). To get the rest of the row, SQL would do a lookup to the cluster. That's called a bookmark lookup in SQL 2000 and a key lookup in SQL 2005 and above. The physical location of the row is totally irrelevance.

That's discussed in this article. The section titled 'lookups'



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #817234
« Prev Topic | Next Topic »

Add to briefcase ««12345»»»

Permissions Expand / Collapse