Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Clustered index, mandatory or optional?! Expand / Collapse
Author
Message
Posted Saturday, October 6, 2012 3:01 PM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Wednesday, October 1, 2014 2:30 PM
Points: 515, Visits: 1,138
Hi,

I'm reviewing our database tables and found some "bad" practices like Clustered index with Uniqueidentifier columns using NewId - support for SQL2005 is mandatory.
I was woundering, and already have a post on natural keys vs surrogate keys, since there's no column on the table that can be sequential, customerCode is string and starts with the state followed by a number, name is also no sequential, .... There's no real good column for clustering and the one that is currently used is GUID.
I can't get rid of the column since it's used as FK and the tables are replicated/synchronized between similar databases. Is it best to remove the clustered index and create it as not clustered or adding an identity column clould be an option but knowing it wouldn't be used for anything else but clustered index?

Thanks,
Pedro




If you need to work better, try working less...
Post #1369466
Posted Monday, October 8, 2012 2:26 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 7:06 PM
Points: 7,125, Visits: 12,720
PiMané (10/6/2012)
Hi,

I'm reviewing our database tables and found some "bad" practices like Clustered index with Uniqueidentifier columns using NewId - support for SQL2005 is mandatory.
I was woundering, and already have a post on natural keys vs surrogate keys, since there's no column on the table that can be sequential, customerCode is string and starts with the state followed by a number, name is also no sequential, .... There's no real good column for clustering and the one that is currently used is GUID.
I can't get rid of the column since it's used as FK and the tables are replicated/synchronized between similar databases. Is it best to remove the clustered index and create it as not clustered or adding an identity column clould be an option but knowing it wouldn't be used for anything else but clustered index?

Thanks,
Pedro

I would almost rather have just about anything besides a UNIQUEIDENTIFIER as a clustering key. How is the data in the table mostly queried? How wide is that customerCode and is that an oft-used column to increase selectivity? Even if it is not unique, if it is narrow (enough) and static it may be a better choice. It's impossible to say without knowing the access pattern for the data.

Some guidance: Ever-increasing clustering key - the Clustered Index Debate..........again! by Kim Tripp


__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Post #1370041
Posted Monday, October 8, 2012 2:43 PM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Wednesday, October 1, 2014 2:30 PM
Points: 515, Visits: 1,138
Basically the PK cloud be an IDENTITY used for lookups (it's unique, small, fast in joins).
But the clustered index should be on the column(s) most used in queries (WHERE statements).
In the example I gave, CustomerCode is used a lot in searches (lookup text boxes in the application) but not used in joins (for that we have the Id).
But CustomerCode is not sequential and can generate a lot of fragmentation on the clustered index.
Is it worth creating the cluster index on CustomerCode and have maintenance plan to rebuild/reorganize the indexes over creating the cluster on the Id (the only advantage is not having RID Lookups but Key Lookups - not much of an advantage...)?

Thanks,
Pedro




If you need to work better, try working less...
Post #1370050
Posted Monday, October 8, 2012 2:46 PM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Wednesday, October 1, 2014 2:30 PM
Points: 515, Visits: 1,138
opc.three (10/8/2012)
How wide is that customerCode and is that an oft-used column to increase selectivity?


Pardon my ignorance, not used to some technical terms in English, but what do you mean by "How wide is that customerCode"?

Thanks,
Pedro




If you need to work better, try working less...
Post #1370052
Posted Monday, October 8, 2012 2:58 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 7:06 PM
Points: 7,125, Visits: 12,720
PiMané (10/8/2012)
opc.three (10/8/2012)
How wide is that customerCode and is that an oft-used column to increase selectivity?


Pardon my ignorance, not used to some technical terms in English, but what do you mean by "How wide is that customerCode"?

Thanks,
Pedro

Sorry, I could have said that in a better way. I meant to ask what is the data type (CHAR, VARCHAR, NCHAR, NVARCHAR) and if it is of variable-length, how wide is the data on average?


__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Post #1370062
Posted Monday, October 8, 2012 3:05 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 7:06 PM
Points: 7,125, Visits: 12,720
PiMané (10/8/2012)
But CustomerCode is not sequential and can generate a lot of fragmentation on the clustered index.

That might be OK. If you set your fill factor appropriately you can fend off fragmentation reasonably well and still control over-consumption of memory.

Is it worth creating the cluster index on CustomerCode and have maintenance plan to rebuild/reorganize the indexes over creating the cluster on the Id (the only advantage is not having RID Lookups but Key Lookups - not much of an advantage...)?

Again it will depend on the data access patterns. The only way to know is to test the various options with a representative workload on representative hardware. Another nice thing about clustered tables is you have some control over page splits whereas with heaps you have very little control over the creation of forwarding records.

Regardless of which clustering key you choose, a good maintenance routine to keep fragmentation and page density at optimal levels will be critical in any scenario.


__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Post #1370070
Posted Tuesday, October 9, 2012 2:32 AM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Wednesday, October 1, 2014 2:30 PM
Points: 515, Visits: 1,138
opc.three (10/8/2012)
PiMané (10/8/2012)
opc.three (10/8/2012)
How wide is that customerCode and is that an oft-used column to increase selectivity?


Pardon my ignorance, not used to some technical terms in English, but what do you mean by "How wide is that customerCode"?

Thanks,
Pedro

Sorry, I could have said that in a better way. I meant to ask what is the data type (CHAR, VARCHAR, NCHAR, NVARCHAR) and if it is of variable-length, how wide is the data on average?


the column is NVARCHAR(20) but mostly used always 10 chars = 2 letters + 8 numbers.
But this is just a case study since other our clients can have only numbers or only chars without any pattern.
That's why it's hard to determine if the column is sequential or not... probably on 80% is sequential, starts at 000001 and goes on... In this case the column would be a very good cluster index...
When we don't know how the clustered index column(s) will be filled is there any FILLFACTOR value appropriate or every case is a different one?

Thanks,
Pedro




If you need to work better, try working less...
Post #1370216
Posted Tuesday, October 9, 2012 6:04 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 7:06 PM
Points: 7,125, Visits: 12,720
Appropriate for all? No. Pick a spot, maybe 90, monitor fragmentation, and adjust up or down.

I would be careful using a variable length column in a PK, especially if their values are not static. Test, test, test.


__________________________________________________________________________________________________
There are no special teachers of virtue, because virtue is taught by the whole community. --Plato
Post #1370347
Posted Tuesday, October 9, 2012 7:16 AM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: Wednesday, October 1, 2014 2:30 PM
Points: 515, Visits: 1,138
opc.three (10/9/2012)
Appropriate for all? No. Pick a spot, maybe 90, monitor fragmentation, and adjust up or down.

Will do that... the problem is that our SW is installed on several customers with different work methodology, we can have customers with 100 rows per table, 1.000 or 10.000 and even 1.000.000.
But since the ones that complaint are the ones with large databases probably it's worth just looking into those cases..

I would be careful using a variable length column in a PK, especially if their values are not static. Test, test, test.

Our PK are usually always INT IDENTITY or GUID, never a NVARCHAR.
We only have indexes over columns with NVARCHAR but not PKs. All lookups, FKs, are made usings INTs or GUIDs

Thanks,
Pedro




If you need to work better, try working less...
Post #1370381
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse