Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12345»»»

Clustered Indexes? Sedimentary, my dear Watson Expand / Collapse
Author
Message
Posted Monday, May 24, 2010 2:06 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Thursday, November 24, 2011 2:31 AM
Points: 169, Visits: 156
It's good to see such sensible advice here. Yes, a table should always have a clustered index, and IMO an Identity helps solve many of the issues with page splits etc that are referenced in the article.

The key to all of this is to ensure that the Identity column is there, and most importantly that it has some meaning within the database.

If you can design your database around identity columns, and have them actually mean something in the context of your system, then you have a clustered index that can be used for searches and joins, whilst also having them organise your tables data.

Personally, I have used this approach on some very large systems. It works.
Post #926652
Posted Monday, May 24, 2010 6:20 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Thursday, February 7, 2013 11:44 AM
Points: 51, Visits: 40
How I do things:

The choice of IDENTITY over something really depends on whether there is already a user defined key. For example, concerning trucking, the Terminal table would use the user defined TerminalID (NV5) as the PK and not an Identity column. CustomerID however would be an IDENTITY column.



Post #926770
Posted Monday, May 24, 2010 7:34 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, October 22, 2012 8:55 AM
Points: 124, Visits: 180
We have a DB which is heavily queried in many bizarre ways (user-definable queries). After a lot of profiling, we found that the IDENTITY of one table was NOT the best clustered index, it was three other fields which are almost always used together, so we changed the Clustered index over.

The next morning, the system was effectively frozen. According to the profiler, this should have been the 'perfect' solution. Unfortunately what we'd neglected to fully take into account was that although more than 60% of the queries were SARGed on that triplet, 40% were on other columns, including the ID, and the "Key Lookup" cost which was being incurred was now amazingly prohibitive.

For OLTP systems which drive UIs, The main problem is that the App typically pulls out most of the columns for display. While having the Clustered Index be something a bit complex (or non-standard) fixes the "bulk" of the queries, it can severely impact other queries, especially if the UI allows searches such Search By multiple (optional) arguments.

We also noticed that once the CI was moved, we had to add a lot of new statistics to 'fill in' for the old CI.

For this reason, unless there's a good reason, we take the "hot spot" hit (low trans/sec, so not too bad). It may not be perfect, but if your system grows 'organically' (i.e. the developers add queries and screens without a Change Management Board), a non-IDENTITY CI may do more harm than good.






Post #926829
Posted Monday, May 24, 2010 8:35 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Monday, October 4, 2010 9:44 AM
Points: 2, Visits: 14
I agree with one of the comment above that stated we need to avoid over generalization. There are many design factors that need to be considered. I personally believe that you need to focus on what adds value to the process. If adding an Identity column as primary key will serve some functional purpose then by all means used it.

If you put it there just for the sake of creating an easy to use ordering and it serves no real value, you may want to look more closely as see if there are better candidates, 'natural keys' that could be utilized.

In general, I like the the identity based clustered index, but only if it has a good reason to exist. Thus, when I design the system, I plan to use them in a meaningful way so that it add value.
Post #926882
Posted Monday, May 24, 2010 12:19 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, March 29, 2011 12:37 PM
Points: 2, Visits: 75
I agree with the posts that says "should not be generalized". In my case i have few sales tables with millions of records. We load data early morning and have few inserts/updates throughout the day but have heavy reads through reports. Each table has a unique indentity column but i have put the clustered index on date and location columns (composite) and few more nonclustered indexes on other columns. This arrangement serves the best in our scenario.
Post #927009
Posted Monday, May 24, 2010 12:21 PM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Today @ 10:16 AM
Points: 17,617, Visits: 15,471
For me it depends is a good answer. I believe tables should have a clustered index (much the same reasons as Gail). I do not always create it on an ever increasing integer field - but that is the case most of the time.



Jason AKA CirqueDeSQLeil
I have given a name to my pain...
MCM SQL Server


SQL RNNR

Posting Performance Based Questions - Gail Shaw
Posting Data Etiquette - Jeff Moden
Hidden RBAR - Jeff Moden
VLFs and the Tran Log - Kimberly Tripp
Post #927013
Posted Monday, May 24, 2010 2:37 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Thursday, April 3, 2014 7:15 PM
Points: 16, Visits: 67
I have a table in a database with hundreds of millions of rows. The records are quite small (30 bytes) with a clustered index on TagID (Int) and SampleDateTime (BigInt). I didn't design the table but I am almost convinced the the clustered index should be reversed because SampleDataTime has much greater uniqueness, always increases (obviously) and we ALWAYS specify a DateTime range in our queries.

I experimented and found that the execution time for queries that specify a DateTime range but no TagIDs came down from 45 seconds to less than 1 second. The only problem is that most existing queries take much longer, presumably because they are structured for the original clustered index,
Post #927080
Posted Tuesday, May 25, 2010 9:35 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Thursday, May 27, 2010 8:26 AM
Points: 1, Visits: 12
I agree with the microsoft approach... but mainly because when I design tables... I reference them through the Identity keys from the get go. Therefore having a clustered index on a sequential ID provides a double benefit when you actually reference a table through the ID key. For name or address searches, I recommend non-clustered index keys that taylor to both the where and select clauses of a common SQL search. The main thing you want to avoid is bookmark-lookups, because they are expensive.

But hey, if there was only one right way of doing things then we wouldn't need programmers or DBAs.
Post #927600
Posted Tuesday, May 25, 2010 12:40 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Monday, October 22, 2012 8:55 AM
Points: 124, Visits: 180
garnet.fehr (5/25/2010)


But hey, if there was only one right way of doing things then we wouldn't need programmers or DBAs.


There'll always be a need for DBAs to slap the programmers upside the head when they do it in the most boneheaded RBAR way imaginable.

Post #927741
Posted Tuesday, May 25, 2010 3:48 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Monday, September 15, 2014 3:12 PM
Points: 70, Visits: 804
I've just been testing fragmentation on a table that has a clustered index on a guid (ouch!), not my design. But I've found that with an appropriate fill factor, in this case 90%, adding 100,000 rows to a table of 8 million fragments that index just a fraction of a percent.

I conclude that page splits can be minimized with an appropriate fill factor. I also defrag all indexes that are more than 10% fragmented on a weekly basis.

Too often I think the clustering on the identity column is done out of laziness - not spending the time to analyze the queries hittng the table to determine the best choice for clustering.
Post #927842
« Prev Topic | Next Topic »

Add to briefcase ««12345»»»

Permissions Expand / Collapse