Log in
::
Register
::
Not logged in
Home
Tags
Articles
Editorials
Stairways
Forums
Scripts
Videos
Blogs
QotD
Books
Ask SSC
SQL Jobs
Training
Authors
About us
Contact us
Newsletters
Write for us
Recent Posts
Recent Posts
Popular Topics
Popular Topics
Home
Search
Members
Calendar
Who's On
Home
»
SQLServerCentral.com
»
Editorials
»
Clustered Indexes? Sedimentary, my dear...
50 posts, Page 2 of 5
««
1
2
3
4
5
»
»»
Clustered Indexes? Sedimentary, my dear Watson
Rate Topic
Display Mode
Topic Options
Author
Message
Mark Hickin
Mark Hickin
Posted Monday, May 24, 2010 2:06 AM
SSC-Enthusiastic
Group: General Forum Members
Last Login: Thursday, November 24, 2011 2:31 AM
Points: 169,
Visits: 156
It's good to see such sensible advice here. Yes, a table should always have a clustered index, and IMO an Identity helps solve many of the issues with page splits etc that are referenced in the article.
The key to all of this is to ensure that the Identity column is there, and most importantly that it has some meaning within the database.
If you can design your database around identity columns, and have them actually mean something in the context of your system, then you have a clustered index that can be used for searches and joins, whilst also having them organise your tables data.
Personally, I have used this approach on some very large systems. It works.
Post #926652
jwainz
jwainz
Posted Monday, May 24, 2010 6:20 AM
Valued Member
Group: General Forum Members
Last Login: Thursday, February 07, 2013 11:44 AM
Points: 51,
Visits: 40
How I do things:
The choice of IDENTITY over something really depends on whether there is already a user defined key. For example, concerning trucking, the Terminal table would use the user defined TerminalID (NV5) as the PK and not an Identity column. CustomerID however would be an IDENTITY column.
Post #926770
Nick W*
Nick W*
Posted Monday, May 24, 2010 7:34 AM
SSC-Enthusiastic
Group: General Forum Members
Last Login: Monday, October 22, 2012 8:55 AM
Points: 124,
Visits: 180
We have a DB which is heavily queried in many bizarre ways (user-definable queries). After a lot of profiling, we found that the IDENTITY of one table was
NOT
the best clustered index, it was
three
other fields which are almost always used together, so we changed the Clustered index over.
The next morning, the system was effectively frozen. According to the profiler, this should have been the 'perfect' solution. Unfortunately what we'd neglected to fully take into account was that although more than 60% of the queries were SARGed on that triplet, 40% were on other columns, including the ID, and the "Key Lookup" cost which was being incurred was now amazingly prohibitive.
For OLTP systems which drive UIs, The main problem is that the App typically pulls out most of the columns for display. While having the Clustered Index be something a bit complex (or non-standard) fixes the "bulk" of the queries, it can
severely
impact other queries, especially if the UI allows searches such Search By multiple (optional) arguments.
We also noticed that once the CI was moved, we had to add a lot of new statistics to 'fill in' for the old CI.
For this reason, unless there's a good reason, we take the "hot spot" hit (low trans/sec, so not too bad). It may not be perfect, but if your system grows 'organically' (i.e. the developers add queries and screens without a Change Management Board), a non-IDENTITY CI may do more harm than good.
Post #926829
marvin.meints
marvin.meints
Posted Monday, May 24, 2010 8:35 AM
Forum Newbie
Group: General Forum Members
Last Login: Monday, October 04, 2010 9:44 AM
Points: 2,
Visits: 14
I agree with one of the comment above that stated we need to avoid over generalization. There are many design factors that need to be considered. I personally believe that you need to focus on what adds value to the process. If adding an Identity column as primary key will serve some functional purpose then by all means used it.
If you put it there just for the sake of creating an easy to use ordering and it serves no real value, you may want to look more closely as see if there are better candidates, 'natural keys' that could be utilized.
In general, I like the the identity based clustered index, but only if it has a good reason to exist. Thus, when I design the system, I plan to use them in a meaningful way so that it add value.
Post #926882
rupeshp
rupeshp
Posted Monday, May 24, 2010 12:19 PM
Forum Newbie
Group: General Forum Members
Last Login: Tuesday, March 29, 2011 12:37 PM
Points: 2,
Visits: 75
I agree with the posts that says "should not be generalized". In my case i have few sales tables with millions of records. We load data early morning and have few inserts/updates throughout the day but have heavy reads through reports. Each table has a unique indentity column but i have put the clustered index on date and location columns (composite) and few more nonclustered indexes on other columns. This arrangement serves the best in our scenario.
Post #927009
SQLRNNR
SQLRNNR
Posted Monday, May 24, 2010 12:21 PM
SSCoach
Group: General Forum Members
Last Login: 2 days ago @ 1:07 PM
Points: 18,733,
Visits: 12,332
For me it depends is a good answer. I believe tables should have a clustered index (much the same reasons as Gail). I do not always create it on an ever increasing integer field - but that is the case most of the time.
Jason
AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server 2008
SQL RNNR
Posting Performance Based Questions - Gail Shaw
Posting Data Etiquette - Jeff Moden
Hidden RBAR - Jeff Moden
VLFs and the Tran Log - Kimberly Tripp
Post #927013
Richard Campbell
Richard Campbell
Posted Monday, May 24, 2010 2:37 PM
Grasshopper
Group: General Forum Members
Last Login: Tuesday, October 16, 2012 6:01 PM
Points: 16,
Visits: 66
I have a table in a database with hundreds of millions of rows. The records are quite small (30 bytes) with a clustered index on TagID (Int) and SampleDateTime (BigInt). I didn't design the table but I am almost convinced the the clustered index should be reversed because SampleDataTime has much greater uniqueness, always increases (obviously) and we ALWAYS specify a DateTime range in our queries.
I experimented and found that the execution time for queries that specify a DateTime range but no TagIDs came down from 45 seconds to less than 1 second. The only problem is that most existing queries take much longer, presumably because they are structured for the original clustered index,
Post #927080
garnet.fehr
garnet.fehr
Posted Tuesday, May 25, 2010 9:35 AM
Forum Newbie
Group: General Forum Members
Last Login: Thursday, May 27, 2010 8:26 AM
Points: 1,
Visits: 12
I agree with the microsoft approach... but mainly because when I design tables... I reference them through the Identity keys from the get go. Therefore having a clustered index on a sequential ID provides a double benefit when you actually reference a table through the ID key. For name or address searches, I recommend non-clustered index keys that taylor to both the where and select clauses of a common SQL search. The main thing you want to avoid is bookmark-lookups, because they are expensive.
But hey, if there was only one right way of doing things then we wouldn't need programmers or DBAs.
Post #927600
Nick W*
Nick W*
Posted Tuesday, May 25, 2010 12:40 PM
SSC-Enthusiastic
Group: General Forum Members
Last Login: Monday, October 22, 2012 8:55 AM
Points: 124,
Visits: 180
garnet.fehr (5/25/2010)
But hey, if there was only one right way of doing things then we wouldn't need programmers or DBAs.
There'll always be a need for DBAs to slap the programmers upside the head when they do it in the most boneheaded RBAR way imaginable.
Post #927741
don.schaeffer
don.schaeffer
Posted Tuesday, May 25, 2010 3:48 PM
Valued Member
Group: General Forum Members
Last Login: 2 days ago @ 2:56 PM
Points: 60,
Visits: 634
I've just been testing fragmentation on a table that has a clustered index on a guid (ouch!), not my design. But I've found that with an appropriate fill factor, in this case 90%, adding 100,000 rows to a table of 8 million fragments that index just a fraction of a percent.
I conclude that page splits can be minimized with an appropriate fill factor. I also defrag all indexes that are more than 10% fragmented on a weekly basis.
Too often I think the clustering on the identity column is done out of laziness - not spending the time to analyze the queries hittng the table to determine the best choice for clustering.
Post #927842
« Prev Topic
|
Next Topic »
50 posts, Page 2 of 5
««
1
2
3
4
5
»
»»
Permissions
You
cannot
post new topics.
You
cannot
post topic replies.
You
cannot
post new polls.
You
cannot
post replies to polls.
You
cannot
edit your own topics.
You
cannot
delete your own topics.
You
cannot
edit other topics.
You
cannot
delete other topics.
You
cannot
edit your own posts.
You
cannot
edit other posts.
You
cannot
delete your own posts.
You
cannot
delete other posts.
You
cannot
post events.
You
cannot
edit your own events.
You
cannot
edit other events.
You
cannot
delete your own events.
You
cannot
delete other events.
You
cannot
send private messages.
You
cannot
send emails.
You
may
read topics.
You
cannot
rate topics.
You
cannot
vote within polls.
You
cannot
upload attachments.
You
may
download attachments.
You
cannot
post HTML code.
You
cannot
edit HTML code.
You
cannot
post IFCode.
You
cannot
post JavaScript.
You
cannot
post EmotIcons.
You
cannot
post or upload images.
Copyright © 2002-2013 Simple Talk Publishing. All Rights Reserved.
Privacy Policy.
Terms of Use.
Report Abuse.