SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Cluster That Index!


Cluster That Index!

Author
Message
Chris Hedgate
Chris Hedgate
SSCrazy
SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)

Group: General Forum Members
Points: 2439 Visits: 7
Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/chedgate/clusterthatindex.asp

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/
virasol
virasol
Old Hand
Old Hand (320 reputation)Old Hand (320 reputation)Old Hand (320 reputation)Old Hand (320 reputation)Old Hand (320 reputation)Old Hand (320 reputation)Old Hand (320 reputation)Old Hand (320 reputation)

Group: General Forum Members
Points: 320 Visits: 1
Thanks for the great article.
Btw: the link to script 1 is broken, there is a space in '.t xt' and the URL for script2.txt does not appear to exist... probably because script2 seems to be included at the bottom of script1.txt



Edited by - virasol on 03/30/2003 3:49:51 PM

Edited by - virasol on 03/30/2003 3:51:00 PM



Chris Hedgate
Chris Hedgate
SSCrazy
SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)

Group: General Forum Members
Points: 2439 Visits: 7
quote:

Thanks for the great article.
Btw: the link to script 1 is broken, there is a space in '.t xt' and the URL for script2.txt does not appear to exist... probably because script2 seems to be included at the bottom of script1.txt



Thanks. I've notified Andy about the broken links, hopefully they'll be fixed later today.

--
Chris Hedgate @ Apptus Technologies (http://www.apptus.se)
http://www.sql.nu

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/
sushila
sushila
Hall of Fame
Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)Hall of Fame (3.7K reputation)

Group: General Forum Members
Points: 3681 Visits: 639
Christoffer,

I really enjoyed reading this article - it was very well written, everything explained in simple but comprehensive terms - by the end of it all I could almost hear the grunts and groans of a non-clustered index!
However, I do have one question about the suggestion of dropping an ID column as clustered so we can actually use a column that will come in more handy and make sense to use as clustered - generally speaking - indexes are known to improve performance - but at the same time, if we have too many indexes on a table, it would also slow it down as each index means more overheads on inserts, updates and deletes! How does this fit in with the clustered/non-clustered pros and cons ?!







**ASCII stupid question, get a stupid ANSI !!!**
jgoodwin
jgoodwin
Valued Member
Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)Valued Member (58 reputation)

Group: General Forum Members
Points: 58 Visits: 10
Great article. One disadvantage to a clustered index that I hear a lot of people claim is disk space. How much disk space does a clustered index actually take up compared to the table it is indexing?



malman
malman
Grasshopper
Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)

Group: General Forum Members
Points: 16 Visits: 1
A very useful article. I would just like to comment on the area of Page splits that was discussed. To get around the page split problem to a degree use the FILL FACTOR and PAD INDEX to set the initial level of data in the pages or when rebuilding. Levels are from 0-100%. Depending on whether the table is heavy on reads or writes as to the% of the page that is initially filled.

I was unaware that heaps could not be fragmented and the explanation on how SQL Server uses forward pointers was very interesting. Thanks for the article

Mark



Chris Hedgate
Chris Hedgate
SSCrazy
SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)

Group: General Forum Members
Points: 2439 Visits: 7
quote:

I really enjoyed reading this article - it was very well written, everything explained in simple but comprehensive terms - by the end of it all I could almost hear the grunts and groans of a non-clustered index!



Thanks. Note however that I am not saying that non-clustered indexes are bad in any way, just that I think you should always have a clustered index. Plus, if you need more, some non-clustered indexes.

quote:

However, I do have one question about the suggestion of dropping an ID column as clustered so we can actually use a column that will come in more handy and make sense to use as clustered - generally speaking - indexes are known to improve performance - but at the same time, if we have too many indexes on a table, it would also slow it down as each index means more overheads on inserts, updates and deletes! How does this fit in with the clustered/non-clustered pros and cons ?!



Well, if you have a table with say one incrementing int column 'id' and one varchar column 'name', then you are probably going to need an index on both these columns, right? The id column is primary key so it will be indexed anyway, and the name is probably a column that you will use as the search argument in where clauses, so you'll need an index for that as well.

In general I would recommend that if you're just going to have one index on a table, make it a clustered one. As I said, although a clustered index really shows it's strength in range queries, it still like a non-clustered index for 'normal' queries. So it won't hurt performance to have a clustered index as opposed to a non-clustered index as the only index on a table. The problems of not having a clustered index that I described should also show that you should choose a clustered index.

--
Chris Hedgate @ Apptus Technologies (http://www.apptus.se)
http://www.sql.nu

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/
Chris Hedgate
Chris Hedgate
SSCrazy
SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)

Group: General Forum Members
Points: 2439 Visits: 7
quote:

Great article. One disadvantage to a clustered index that I hear a lot of people claim is disk space. How much disk space does a clustered index actually take up compared to the table it is indexing?



Thanks for the compliments and for commenting. First of all, I'd say that if I had to choose between extra disk space or the problems described in the article, I'd choose disk space any time. Now, to answer your question, I'd say that extra disk space for a clustered index is normally not going to be that much of an issue.

Remember, the leaves are the actual data, even if you drop the clustered index these pages will still be (more or less) the same. So it it only the levels of the index tree above the leaves that are interesting. How much space they use of course depends on the data in your table, how much data you have and how wide your index key is. Since you need one index row for every data page (i.e. the leaves of the clustered index), this means that to determine the number of index pages at the level above the leaf level you can use this formula:

number of pages devided by (8096 bytes per index page divided by the index key row size)

This gives us the number of index pages at the level above the leaf level. You can then take this number of index pages and use that as the first part of the formula (number of pages) to determine the number of index pages needed at the next level. When you end up with just a single page you've found the root level.

In Inside SQL Server 2000 Kalen Delaney has an example of a table with 10.000 data pages (remember each page contains 8KB of data), with a clustered index on a fixed-length character column of 5 bytes. With overhead this makes each index key row 12 bytes.

10000 / (8096 / 12) = 15

This means that at the level above the leaves 15 index pages are necessary. The index rows for these 15 pages can then be contained on a single page at the level above, so that is the root page. So, the extra size for this table with a clustered index on the char(5) column is 16/10000, i.e. less than 1%. This number (1% extra size for a clustered index) is normally a good estimate, normally even a bit high as shown in the example. As always you must remember to keep index key size as small as possible of course.

PS. Note that fillfactor and padindex configurations are not taken into account here. DS.

--
Chris Hedgate @ Apptus Technologies (http://www.apptus.se)
http://www.sql.nu

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/
Chris Hedgate
Chris Hedgate
SSCrazy
SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)SSCrazy (2.4K reputation)

Group: General Forum Members
Points: 2439 Visits: 7
quote:

A very useful article. I would just like to comment on the area of Page splits that was discussed. To get around the page split problem to a degree use the FILL FACTOR and PAD INDEX to set the initial level of data in the pages or when rebuilding. Levels are from 0-100%. Depending on whether the table is heavy on reads or writes as to the% of the page that is initially filled.



Thanks Mark, good comments on fillfactor and padindex. I explicitly didn't go into them in the article as it would have taken too much focus from the article. They could of course be discussed in a follow-up article.

--
Chris Hedgate @ Apptus Technologies (http://www.apptus.se)
http://www.sql.nu

--
Chris Hedgate http://www.hedgate.net/
Contributor to the Best of SQL Server Central volumes
Articles: http://www.sqlservercentral.com/columnists/chedgate/
Razvan Socol
Razvan Socol
SSC Eights!
SSC Eights! (849 reputation)SSC Eights! (849 reputation)SSC Eights! (849 reputation)SSC Eights! (849 reputation)SSC Eights! (849 reputation)SSC Eights! (849 reputation)SSC Eights! (849 reputation)SSC Eights! (849 reputation)

Group: General Forum Members
Points: 849 Visits: 129
quote:

Great article. One disadvantage to a clustered index that I hear a lot of people claim is disk space. How much disk space does a clustered index actually take up compared to the table it is indexing?



Jgoodwin,
from Chris Hedgate's response we should notice that the extra space needed for the clustered index is highly dependent on the key size. For the table with 10000 data pages, if the key size was 5, only 16 pages (0.2%) were needed; but if the key size were 1000, it would take about 1430 pages for the index (that's 15% more space): 8096/1007=8, 10000/8 + 10000/8/8 + 10000/8/8/8 + ... + 1 = 1430. And this is only for the clustered index itself, but the other non-clustered indexes will need extra space too.

Conclusion: you SHOULD use a clustered index, but the key for the index must be small (less than 100 bytes, ideally 4 or 8 bytes).

Razvan



Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search