SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Clustered index, mandatory or optional?!


Clustered index, mandatory or optional?!

Author
Message
PiMané
PiMané
SSCrazy
SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)

Group: General Forum Members
Points: 2990 Visits: 1351
Hi,

I'm reviewing our database tables and found some "bad" practices like Clustered index with Uniqueidentifier columns using NewId - support for SQL2005 is mandatory.
I was woundering, and already have a post on natural keys vs surrogate keys, since there's no column on the table that can be sequential, customerCode is string and starts with the state followed by a number, name is also no sequential, .... There's no real good column for clustering and the one that is currently used is GUID.
I can't get rid of the column since it's used as FK and the tables are replicated/synchronized between similar databases. Is it best to remove the clustered index and create it as not clustered or adding an identity column clould be an option but knowing it wouldn't be used for anything else but clustered index?

Thanks,
Pedro



If you need to work better, try working less...
Orlando Colamatteo
Orlando Colamatteo
SSC-Forever
SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)

Group: General Forum Members
Points: 41202 Visits: 14413
PiMané (10/6/2012)
Hi,

I'm reviewing our database tables and found some "bad" practices like Clustered index with Uniqueidentifier columns using NewId - support for SQL2005 is mandatory.
I was woundering, and already have a post on natural keys vs surrogate keys, since there's no column on the table that can be sequential, customerCode is string and starts with the state followed by a number, name is also no sequential, .... There's no real good column for clustering and the one that is currently used is GUID.
I can't get rid of the column since it's used as FK and the tables are replicated/synchronized between similar databases. Is it best to remove the clustered index and create it as not clustered or adding an identity column clould be an option but knowing it wouldn't be used for anything else but clustered index?

Thanks,
Pedro

I would almost rather have just about anything besides a UNIQUEIDENTIFIER as a clustering key. How is the data in the table mostly queried? How wide is that customerCode and is that an oft-used column to increase selectivity? Even if it is not unique, if it is narrow (enough) and static it may be a better choice. It's impossible to say without knowing the access pattern for the data.

Some guidance: Ever-increasing clustering key - the Clustered Index Debate..........again! by Kim Tripp

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
PiMané
PiMané
SSCrazy
SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)

Group: General Forum Members
Points: 2990 Visits: 1351
Basically the PK cloud be an IDENTITY used for lookups (it's unique, small, fast in joins).
But the clustered index should be on the column(s) most used in queries (WHERE statements).
In the example I gave, CustomerCode is used a lot in searches (lookup text boxes in the application) but not used in joins (for that we have the Id).
But CustomerCode is not sequential and can generate a lot of fragmentation on the clustered index.
Is it worth creating the cluster index on CustomerCode and have maintenance plan to rebuild/reorganize the indexes over creating the cluster on the Id (the only advantage is not having RID Lookups but Key Lookups - not much of an advantage...)?

Thanks,
Pedro



If you need to work better, try working less...
PiMané
PiMané
SSCrazy
SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)

Group: General Forum Members
Points: 2990 Visits: 1351
opc.three (10/8/2012)
How wide is that customerCode and is that an oft-used column to increase selectivity?


Pardon my ignorance, not used to some technical terms in English, but what do you mean by "How wide is that customerCode"?

Thanks,
Pedro



If you need to work better, try working less...
Orlando Colamatteo
Orlando Colamatteo
SSC-Forever
SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)

Group: General Forum Members
Points: 41202 Visits: 14413
PiMané (10/8/2012)
opc.three (10/8/2012)
How wide is that customerCode and is that an oft-used column to increase selectivity?


Pardon my ignorance, not used to some technical terms in English, but what do you mean by "How wide is that customerCode"?

Thanks,
Pedro

Sorry, I could have said that in a better way. I meant to ask what is the data type (CHAR, VARCHAR, NCHAR, NVARCHAR) and if it is of variable-length, how wide is the data on average?

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Orlando Colamatteo
Orlando Colamatteo
SSC-Forever
SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)

Group: General Forum Members
Points: 41202 Visits: 14413
PiMané (10/8/2012)
But CustomerCode is not sequential and can generate a lot of fragmentation on the clustered index.

That might be OK. If you set your fill factor appropriately you can fend off fragmentation reasonably well and still control over-consumption of memory.

Is it worth creating the cluster index on CustomerCode and have maintenance plan to rebuild/reorganize the indexes over creating the cluster on the Id (the only advantage is not having RID Lookups but Key Lookups - not much of an advantage...)?

Again it will depend on the data access patterns. The only way to know is to test the various options with a representative workload on representative hardware. Another nice thing about clustered tables is you have some control over page splits whereas with heaps you have very little control over the creation of forwarding records.

Regardless of which clustering key you choose, a good maintenance routine to keep fragmentation and page density at optimal levels will be critical in any scenario.

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
PiMané
PiMané
SSCrazy
SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)

Group: General Forum Members
Points: 2990 Visits: 1351
opc.three (10/8/2012)
PiMané (10/8/2012)
opc.three (10/8/2012)
How wide is that customerCode and is that an oft-used column to increase selectivity?


Pardon my ignorance, not used to some technical terms in English, but what do you mean by "How wide is that customerCode"?

Thanks,
Pedro

Sorry, I could have said that in a better way. I meant to ask what is the data type (CHAR, VARCHAR, NCHAR, NVARCHAR) and if it is of variable-length, how wide is the data on average?


:-) the column is NVARCHAR(20) but mostly used always 10 chars = 2 letters + 8 numbers.
But this is just a case study since other our clients can have only numbers or only chars without any pattern.
That's why it's hard to determine if the column is sequential or not... probably on 80% is sequential, starts at 000001 and goes on... In this case the column would be a very good cluster index...
When we don't know how the clustered index column(s) will be filled is there any FILLFACTOR value appropriate or every case is a different one?

Thanks,
Pedro



If you need to work better, try working less...
Orlando Colamatteo
Orlando Colamatteo
SSC-Forever
SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)SSC-Forever (41K reputation)

Group: General Forum Members
Points: 41202 Visits: 14413
Appropriate for all? No. Pick a spot, maybe 90, monitor fragmentation, and adjust up or down.

I would be careful using a variable length column in a PK, especially if their values are not static. Test, test, test.

__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
PiMané
PiMané
SSCrazy
SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)SSCrazy (3K reputation)

Group: General Forum Members
Points: 2990 Visits: 1351
opc.three (10/9/2012)
Appropriate for all? No. Pick a spot, maybe 90, monitor fragmentation, and adjust up or down.

Will do that... the problem is that our SW is installed on several customers with different work methodology, we can have customers with 100 rows per table, 1.000 or 10.000 and even 1.000.000.
But since the ones that complaint are the ones with large databases probably it's worth just looking into those cases..

I would be careful using a variable length column in a PK, especially if their values are not static. Test, test, test.

Our PK are usually always INT IDENTITY or GUID, never a NVARCHAR.
We only have indexes over columns with NVARCHAR but not PKs. All lookups, FKs, are made usings INTs or GUIDs

Thanks,
Pedro



If you need to work better, try working less...
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search