Log in
::
Register
::
Not logged in
Home
Tags
Articles
Editorials
Stairways
Forums
Scripts
Videos
Blogs
QotD
Books
Ask SSC
SQL Jobs
Training
Authors
About us
Contact us
Newsletters
Write for us
Recent Posts
Recent Posts
Popular Topics
Popular Topics
Home
Search
Members
Calendar
Who's On
Home
»
SQL Server 2008
»
SQL Server 2008 - General
»
Clustered index, mandatory or optional?!
Clustered index, mandatory or optional?!
Rate Topic
Display Mode
Topic Options
Author
Message
PiMané
PiMané
Posted Saturday, October 06, 2012 3:01 PM
SSC-Addicted
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 5:02 AM
Points: 403,
Visits: 904
Hi,
I'm reviewing our database tables and found some "bad" practices like Clustered index with Uniqueidentifier columns using NewId - support for SQL2005 is mandatory.
I was woundering, and already have a post on natural keys vs surrogate keys, since there's no column on the table that can be sequential, customerCode is string and starts with the state followed by a number, name is also no sequential, .... There's no real good column for clustering and the one that is currently used is GUID.
I can't get rid of the column since it's used as FK and the tables are replicated/synchronized between similar databases. Is it best to remove the clustered index and create it as not clustered or adding an identity column clould be an option but knowing it wouldn't be used for anything else but clustered index?
Thanks,
Pedro
If you need to work better, try working less...
Post #1369466
opc.three
opc.three
Posted Monday, October 08, 2012 2:26 PM
SSCertifiable
Group: General Forum Members
Last Login: Today @ 1:07 PM
Points: 6,826,
Visits: 11,951
PiMané (10/6/2012)
Hi,
I'm reviewing our database tables and found some "bad" practices like Clustered index with Uniqueidentifier columns using NewId - support for SQL2005 is mandatory.
I was woundering, and already have a post on natural keys vs surrogate keys, since there's no column on the table that can be sequential, customerCode is string and starts with the state followed by a number, name is also no sequential, .... There's no real good column for clustering and the one that is currently used is GUID.
I can't get rid of the column since it's used as FK and the tables are replicated/synchronized between similar databases. Is it best to remove the clustered index and create it as not clustered or adding an identity column clould be an option but knowing it wouldn't be used for anything else but clustered index?
Thanks,
Pedro
I would
almost
rather have just about anything besides a UNIQUEIDENTIFIER as a clustering key. How is the data in the table mostly queried? How wide is that customerCode and is that an oft-used column to increase selectivity? Even if it is not unique, if it is narrow (enough) and static it may be a better choice. It's impossible to say without knowing the access pattern for the data.
Some guidance:
Ever-increasing clustering key - the Clustered Index Debate..........again! by Kim Tripp
__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Believe you can and you're halfway there.
--Theodore Roosevelt
Everything Should Be Made as Simple as Possible, But Not Simpler
--Albert Einstein
The significant problems we face cannot be solved at the same level of thinking we were at when we created them.
--Albert Einstein
1 apple is not exactly 1/8 of 8 apples. Because there are no absolutely identical apples.
--Giordy
Post #1370041
PiMané
PiMané
Posted Monday, October 08, 2012 2:43 PM
SSC-Addicted
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 5:02 AM
Points: 403,
Visits: 904
Basically the PK cloud be an IDENTITY used for lookups (it's unique, small, fast in joins).
But the clustered index should be on the column(s) most used in queries (WHERE statements).
In the example I gave, CustomerCode is used a lot in searches (lookup text boxes in the application) but not used in joins (for that we have the Id).
But CustomerCode is not sequential and can generate a lot of fragmentation on the clustered index.
Is it worth creating the cluster index on CustomerCode and have maintenance plan to rebuild/reorganize the indexes over creating the cluster on the Id (the only advantage is not having RID Lookups but Key Lookups - not much of an advantage...)?
Thanks,
Pedro
If you need to work better, try working less...
Post #1370050
PiMané
PiMané
Posted Monday, October 08, 2012 2:46 PM
SSC-Addicted
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 5:02 AM
Points: 403,
Visits: 904
opc.three (10/8/2012)
How wide is that customerCode and is that an oft-used column to increase selectivity?
Pardon my ignorance, not used to some technical terms in English, but what do you mean by "How wide is that customerCode"?
Thanks,
Pedro
If you need to work better, try working less...
Post #1370052
opc.three
opc.three
Posted Monday, October 08, 2012 2:58 PM
SSCertifiable
Group: General Forum Members
Last Login: Today @ 1:07 PM
Points: 6,826,
Visits: 11,951
PiMané (10/8/2012)
opc.three (10/8/2012)
How wide is that customerCode and is that an oft-used column to increase selectivity?
Pardon my ignorance, not used to some technical terms in English, but what do you mean by "How wide is that customerCode"?
Thanks,
Pedro
Sorry, I could have said that in a better way. I meant to ask what is the data type (CHAR, VARCHAR, NCHAR, NVARCHAR) and if it is of variable-length, how wide is the data on average?
__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Believe you can and you're halfway there.
--Theodore Roosevelt
Everything Should Be Made as Simple as Possible, But Not Simpler
--Albert Einstein
The significant problems we face cannot be solved at the same level of thinking we were at when we created them.
--Albert Einstein
1 apple is not exactly 1/8 of 8 apples. Because there are no absolutely identical apples.
--Giordy
Post #1370062
opc.three
opc.three
Posted Monday, October 08, 2012 3:05 PM
SSCertifiable
Group: General Forum Members
Last Login: Today @ 1:07 PM
Points: 6,826,
Visits: 11,951
PiMané (10/8/2012)
But CustomerCode is not sequential and can generate a lot of fragmentation on the clustered index.
That might be OK. If you set your fill factor appropriately you can fend off fragmentation reasonably well and still control over-consumption of memory.
Is it worth creating the cluster index on CustomerCode and have maintenance plan to rebuild/reorganize the indexes over creating the cluster on the Id (the only advantage is not having RID Lookups but Key Lookups - not much of an advantage...)?
Again it will depend on the data access patterns. The only way to know is to test the various options with a representative workload on representative hardware. Another nice thing about clustered tables is you have some control over page splits whereas with heaps you have very little control over the creation of forwarding records.
Regardless of which clustering key you choose, a good maintenance routine to keep fragmentation and page density at optimal levels will be critical in any scenario.
__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Believe you can and you're halfway there.
--Theodore Roosevelt
Everything Should Be Made as Simple as Possible, But Not Simpler
--Albert Einstein
The significant problems we face cannot be solved at the same level of thinking we were at when we created them.
--Albert Einstein
1 apple is not exactly 1/8 of 8 apples. Because there are no absolutely identical apples.
--Giordy
Post #1370070
PiMané
PiMané
Posted Tuesday, October 09, 2012 2:32 AM
SSC-Addicted
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 5:02 AM
Points: 403,
Visits: 904
opc.three (10/8/2012)
PiMané (10/8/2012)
opc.three (10/8/2012)
How wide is that customerCode and is that an oft-used column to increase selectivity?
Pardon my ignorance, not used to some technical terms in English, but what do you mean by "How wide is that customerCode"?
Thanks,
Pedro
Sorry, I could have said that in a better way. I meant to ask what is the data type (CHAR, VARCHAR, NCHAR, NVARCHAR) and if it is of variable-length, how wide is the data on average?
the column is NVARCHAR(20) but mostly used always 10 chars = 2 letters + 8 numbers.
But this is just a case study since other our clients can have only numbers or only chars without any pattern.
That's why it's hard to determine if the column is sequential or not... probably on 80% is sequential, starts at 000001 and goes on... In this case the column would be a very good cluster index...
When we don't know how the clustered index column(s) will be filled is there any FILLFACTOR value appropriate or every case is a different one?
Thanks,
Pedro
If you need to work better, try working less...
Post #1370216
opc.three
opc.three
Posted Tuesday, October 09, 2012 6:04 AM
SSCertifiable
Group: General Forum Members
Last Login: Today @ 1:07 PM
Points: 6,826,
Visits: 11,951
Appropriate for all? No. Pick a spot, maybe 90, monitor fragmentation, and adjust up or down.
I would be careful using a variable length column in a PK, especially if their values are not static. Test, test, test.
__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community.
--Plato
Believe you can and you're halfway there.
--Theodore Roosevelt
Everything Should Be Made as Simple as Possible, But Not Simpler
--Albert Einstein
The significant problems we face cannot be solved at the same level of thinking we were at when we created them.
--Albert Einstein
1 apple is not exactly 1/8 of 8 apples. Because there are no absolutely identical apples.
--Giordy
Post #1370347
PiMané
PiMané
Posted Tuesday, October 09, 2012 7:16 AM
SSC-Addicted
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 5:02 AM
Points: 403,
Visits: 904
opc.three (10/9/2012)
Appropriate for all? No. Pick a spot, maybe 90, monitor fragmentation, and adjust up or down.
Will do that... the problem is that our SW is installed on several customers with different work methodology, we can have customers with 100 rows per table, 1.000 or 10.000 and even 1.000.000.
But since the ones that complaint are the ones with large databases probably it's worth just looking into those cases..
I would be careful using a variable length column in a PK, especially if their values are not static. Test, test, test.
Our PK are usually always INT IDENTITY or GUID, never a NVARCHAR.
We only have indexes over columns with NVARCHAR but not PKs. All lookups, FKs, are made usings INTs or GUIDs
Thanks,
Pedro
If you need to work better, try working less...
Post #1370381
« Prev Topic
|
Next Topic »
Permissions
You
cannot
post new topics.
You
cannot
post topic replies.
You
cannot
post new polls.
You
cannot
post replies to polls.
You
cannot
edit your own topics.
You
cannot
delete your own topics.
You
cannot
edit other topics.
You
cannot
delete other topics.
You
cannot
edit your own posts.
You
cannot
edit other posts.
You
cannot
delete your own posts.
You
cannot
delete other posts.
You
cannot
post events.
You
cannot
edit your own events.
You
cannot
edit other events.
You
cannot
delete your own events.
You
cannot
delete other events.
You
cannot
send private messages.
You
cannot
send emails.
You
may
read topics.
You
cannot
rate topics.
You
cannot
vote within polls.
You
cannot
upload attachments.
You
may
download attachments.
You
cannot
post HTML code.
You
cannot
edit HTML code.
You
cannot
post IFCode.
You
cannot
post JavaScript.
You
cannot
post EmotIcons.
You
cannot
post or upload images.
Copyright © 2002-2013 Simple Talk Publishing. All Rights Reserved.
Privacy Policy.
Terms of Use.
Report Abuse.