Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Cluster That Index! Expand / Collapse
Author
Message
Posted Sunday, March 30, 2003 12:00 AM
UDP Broadcaster

UDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP Broadcaster

Group: General Forum Members
Last Login: Saturday, February 28, 2009 6:51 AM
Points: 1,489, Visits: 7
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/chedgate/clusterthatindex.asp



--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/
Post #10961
Posted Sunday, March 30, 2003 3:46 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Thursday, October 21, 2004 9:17 PM
Points: 312, Visits: 1
Thanks for the great article.
Btw: the link to script 1 is broken, there is a space in '.t xt' and the URL for script2.txt does not appear to exist... probably because script2 seems to be included at the bottom of script1.txt



Edited by - virasol on 03/30/2003 3:49:51 PM

Edited by - virasol on 03/30/2003 3:51:00 PM



Post #57633
Posted Monday, March 31, 2003 12:00 AM
UDP Broadcaster

UDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP Broadcaster

Group: General Forum Members
Last Login: Saturday, February 28, 2009 6:51 AM
Points: 1,489, Visits: 7
quote:

Thanks for the great article.
Btw: the link to script 1 is broken, there is a space in '.t xt' and the URL for script2.txt does not appear to exist... probably because script2 seems to be included at the bottom of script1.txt



Thanks. I've notified Andy about the broken links, hopefully they'll be fixed later today.

--
Chris Hedgate @ Apptus Technologies (http://www.apptus.se)
http://www.sql.nu




--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/
Post #57634
Posted Monday, March 31, 2003 8:58 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 7:10 AM
Points: 2,555, Visits: 602
Christoffer,

I really enjoyed reading this article - it was very well written, everything explained in simple but comprehensive terms - by the end of it all I could almost hear the grunts and groans of a non-clustered index!
However, I do have one question about the suggestion of dropping an ID column as clustered so we can actually use a column that will come in more handy and make sense to use as clustered - generally speaking - indexes are known to improve performance - but at the same time, if we have too many indexes on a table, it would also slow it down as each index means more overheads on inserts, updates and deletes! How does this fit in with the clustered/non-clustered pros and cons ?!









**ASCII stupid question, get a stupid ANSI !!!**
Post #57635
Posted Monday, March 31, 2003 11:02 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Tuesday, April 21, 2009 10:33 AM
Points: 50, Visits: 10
Great article. One disadvantage to a clustered index that I hear a lot of people claim is disk space. How much disk space does a clustered index actually take up compared to the table it is indexing?


Post #57636
Posted Tuesday, April 1, 2003 1:45 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, October 10, 2005 4:49 AM
Points: 16, Visits: 1
A very useful article. I would just like to comment on the area of Page splits that was discussed. To get around the page split problem to a degree use the FILL FACTOR and PAD INDEX to set the initial level of data in the pages or when rebuilding. Levels are from 0-100%. Depending on whether the table is heavy on reads or writes as to the% of the page that is initially filled.

I was unaware that heaps could not be fragmented and the explanation on how SQL Server uses forward pointers was very interesting. Thanks for the article

Mark




Post #57637
Posted Tuesday, April 1, 2003 4:49 AM
UDP Broadcaster

UDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP Broadcaster

Group: General Forum Members
Last Login: Saturday, February 28, 2009 6:51 AM
Points: 1,489, Visits: 7
quote:

I really enjoyed reading this article - it was very well written, everything explained in simple but comprehensive terms - by the end of it all I could almost hear the grunts and groans of a non-clustered index!



Thanks. Note however that I am not saying that non-clustered indexes are bad in any way, just that I think you should always have a clustered index. Plus, if you need more, some non-clustered indexes.

quote:

However, I do have one question about the suggestion of dropping an ID column as clustered so we can actually use a column that will come in more handy and make sense to use as clustered - generally speaking - indexes are known to improve performance - but at the same time, if we have too many indexes on a table, it would also slow it down as each index means more overheads on inserts, updates and deletes! How does this fit in with the clustered/non-clustered pros and cons ?!



Well, if you have a table with say one incrementing int column 'id' and one varchar column 'name', then you are probably going to need an index on both these columns, right? The id column is primary key so it will be indexed anyway, and the name is probably a column that you will use as the search argument in where clauses, so you'll need an index for that as well.

In general I would recommend that if you're just going to have one index on a table, make it a clustered one. As I said, although a clustered index really shows it's strength in range queries, it still like a non-clustered index for 'normal' queries. So it won't hurt performance to have a clustered index as opposed to a non-clustered index as the only index on a table. The problems of not having a clustered index that I described should also show that you should choose a clustered index.

--
Chris Hedgate @ Apptus Technologies (http://www.apptus.se)
http://www.sql.nu




--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/
Post #57638
Posted Tuesday, April 1, 2003 5:37 AM
UDP Broadcaster

UDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP Broadcaster

Group: General Forum Members
Last Login: Saturday, February 28, 2009 6:51 AM
Points: 1,489, Visits: 7
quote:

Great article. One disadvantage to a clustered index that I hear a lot of people claim is disk space. How much disk space does a clustered index actually take up compared to the table it is indexing?



Thanks for the compliments and for commenting. First of all, I'd say that if I had to choose between extra disk space or the problems described in the article, I'd choose disk space any time. Now, to answer your question, I'd say that extra disk space for a clustered index is normally not going to be that much of an issue.

Remember, the leaves are the actual data, even if you drop the clustered index these pages will still be (more or less) the same. So it it only the levels of the index tree above the leaves that are interesting. How much space they use of course depends on the data in your table, how much data you have and how wide your index key is. Since you need one index row for every data page (i.e. the leaves of the clustered index), this means that to determine the number of index pages at the level above the leaf level you can use this formula:

number of pages devided by (8096 bytes per index page divided by the index key row size)

This gives us the number of index pages at the level above the leaf level. You can then take this number of index pages and use that as the first part of the formula (number of pages) to determine the number of index pages needed at the next level. When you end up with just a single page you've found the root level.

In Inside SQL Server 2000 Kalen Delaney has an example of a table with 10.000 data pages (remember each page contains 8KB of data), with a clustered index on a fixed-length character column of 5 bytes. With overhead this makes each index key row 12 bytes.

10000 / (8096 / 12) = 15

This means that at the level above the leaves 15 index pages are necessary. The index rows for these 15 pages can then be contained on a single page at the level above, so that is the root page. So, the extra size for this table with a clustered index on the char(5) column is 16/10000, i.e. less than 1%. This number (1% extra size for a clustered index) is normally a good estimate, normally even a bit high as shown in the example. As always you must remember to keep index key size as small as possible of course.

PS. Note that fillfactor and padindex configurations are not taken into account here. DS.

--
Chris Hedgate @ Apptus Technologies (http://www.apptus.se)
http://www.sql.nu




--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/
Post #57639
Posted Tuesday, April 1, 2003 5:43 AM
UDP Broadcaster

UDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP BroadcasterUDP Broadcaster

Group: General Forum Members
Last Login: Saturday, February 28, 2009 6:51 AM
Points: 1,489, Visits: 7
quote:

A very useful article. I would just like to comment on the area of Page splits that was discussed. To get around the page split problem to a degree use the FILL FACTOR and PAD INDEX to set the initial level of data in the pages or when rebuilding. Levels are from 0-100%. Depending on whether the table is heavy on reads or writes as to the% of the page that is initially filled.



Thanks Mark, good comments on fillfactor and padindex. I explicitly didn't go into them in the article as it would have taken too much focus from the article. They could of course be discussed in a follow-up article.

--
Chris Hedgate @ Apptus Technologies (http://www.apptus.se)
http://www.sql.nu




--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/
Post #57640
Posted Wednesday, October 8, 2003 3:20 AM
Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Friday, September 6, 2013 2:16 AM
Points: 693, Visits: 124
quote:

Great article. One disadvantage to a clustered index that I hear a lot of people claim is disk space. How much disk space does a clustered index actually take up compared to the table it is indexing?



Jgoodwin,
from Chris Hedgate's response we should notice that the extra space needed for the clustered index is highly dependent on the key size. For the table with 10000 data pages, if the key size was 5, only 16 pages (0.2%) were needed; but if the key size were 1000, it would take about 1430 pages for the index (that's 15% more space): 8096/1007=8, 10000/8 + 10000/8/8 + 10000/8/8/8 + ... + 1 = 1430. And this is only for the clustered index itself, but the other non-clustered indexes will need extra space too.

Conclusion: you SHOULD use a clustered index, but the key for the index must be small (less than 100 bytes, ideally 4 or 8 bytes).

Razvan




Post #57641
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse