Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Including additional columns in a unique clustered index Expand / Collapse
Author
Message
Posted Monday, March 4, 2013 10:58 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, March 4, 2013 5:19 PM
Points: 15, Visits: 86
I am creating an indexed view from a business unit table. The goal of the view is to pre-filter the unit list to only those supported by our division. The business hierarchy goes from unit to district to region, with unit # being unique on the table.

However, we rarely ever use unit in our applications and instead focus on the district number (as per business requirements.) One of the most common queries performed is to get a distinct list of districts based upon region selections. Only occasionally would the unit # be included too.

Should the region and district columns be included in the unique clustered index of the view? Or should they be limited to a separate nonclustered index? I'm leaning toward the latter, but if anyone has a good reason to include them in the clustered index, I'd love to hear it.
Post #1426373
Posted Monday, March 4, 2013 12:42 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Friday, August 29, 2014 2:25 PM
Points: 2,044, Visits: 3,059
Hmm. In a normalized design, I wouldn't expect the region to be in the unit table, only the district.

Instead, I'd expect the distinct list of districts w/i a region to come from a districts table. So you don't have a separate districts/district-level table?


SQL DBA,SQL Server MVP('07, '08, '09)
"In America, every man is innocent until proven broke!" Brant Parker
Post #1426417
Posted Monday, March 4, 2013 1:29 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, March 4, 2013 5:19 PM
Points: 15, Visits: 86
Good observation, however, the table design was not mine and probably existed before I was even hired. It is now part of our data warehouse and is not likely to be changed anytime soon (if ever.)

Please just accept that the table is what it is, warts and all.
Post #1426438
Posted Monday, March 4, 2013 3:03 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Friday, August 29, 2014 2:25 PM
Points: 2,044, Visits: 3,059
Ok.

At any rate, typically you'll want to cluster the table on the most common queries that can't be easily satisified by a nonclustered index you need anyway.

For example, if (nearly) every query specifies district, you should probably cluster the table on district first. Add the region if needed to make the clus key unique.

You might still want a nonclus index on region and district, to speed up the specific query you mentioned. But you don't want to cluster the whole table on region if that's not typically specified for most queries.

Btw, I'm still not clear on where "division" fits into this, so I've gone with more generic info.


SQL DBA,SQL Server MVP('07, '08, '09)
"In America, every man is innocent until proven broke!" Brant Parker
Post #1426479
Posted Monday, March 4, 2013 3:38 PM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 10:05 AM
Points: 42,829, Visits: 35,961
dembones79 (3/4/2013)
Should the region and district columns be included in the unique clustered index of the view?


Bear in mind that a clustered index implicitly includes all the columns in the table/view. For the key I'd specify the minimum of columns, just what is the unique portion. The rest of the columns are in the index too.
If it were a nonclustered index I'd still put just the unqiue columns in the key, the others would be include columns.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1426493
Posted Monday, March 4, 2013 4:00 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Friday, August 29, 2014 2:25 PM
Points: 2,044, Visits: 3,059
GilaMonster (3/4/2013)
dembones79 (3/4/2013)
Should the region and district columns be included in the unique clustered index of the view?


Bear in mind that a clustered index implicitly includes all the columns in the table/view. For the key I'd specify the minimum of columns, just what is the unique portion. The rest of the columns are in the index too.
If it were a nonclustered index I'd still put just the unqiue columns in the key, the others would be include columns.



If you (almost) always select by multiple columns, and depending on the selectivity of the columns, you might get significant gains from including another column in the clus index. Don't follow too simplistic rules of "(always) one column per clus index" at the expense of the genuine performance of your dbs.

Look at some of MS's system tables: some of their internal replication tables have 5 clus keys. I'm sure some would sorely disapprove of that, but it might be needed for best performance.


SQL DBA,SQL Server MVP('07, '08, '09)
"In America, every man is innocent until proven broke!" Brant Parker
Post #1426503
Posted Monday, March 4, 2013 4:16 PM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 10:05 AM
Points: 42,829, Visits: 35,961
ScottPletcher (3/4/2013)
GilaMonster (3/4/2013)
dembones79 (3/4/2013)
Should the region and district columns be included in the unique clustered index of the view?


Bear in mind that a clustered index implicitly includes all the columns in the table/view. For the key I'd specify the minimum of columns, just what is the unique portion. The rest of the columns are in the index too.
If it were a nonclustered index I'd still put just the unqiue columns in the key, the others would be include columns.



If you (almost) always select by multiple columns, and depending on the selectivity of the columns, you might get significant gains from including another column in the clus index. Don't follow too simplistic rules of "(always) one column per clus index" at the expense of the genuine performance of your dbs.


Adding another column to a unique index will allow duplicates to the columns that are supposed to be unique. That's not a performance issue, that's a data integrity issue.

If defining a set of columns as unique, that index should have just the unique columns defined in it, otherwise the unique constraint is not doing the job of enforcing uniqueness across a set of columns.

If you have a business design that says that email address column must be unique, then defining the unique index on email address, surname, status and last order date may be great for performance, but so much for the unique email address.

Look at some of MS's system tables: some of their internal replication tables have 5 clus keys.


Yeah, and many of them have horrific performance.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1426508
Posted Monday, March 4, 2013 4:26 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Friday, August 29, 2014 2:25 PM
Points: 2,044, Visits: 3,059
GilaMonster (3/4/2013)
ScottPletcher (3/4/2013)
GilaMonster (3/4/2013)
dembones79 (3/4/2013)
Should the region and district columns be included in the unique clustered index of the view?


Bear in mind that a clustered index implicitly includes all the columns in the table/view. For the key I'd specify the minimum of columns, just what is the unique portion. The rest of the columns are in the index too.
If it were a nonclustered index I'd still put just the unqiue columns in the key, the others would be include columns.



If you (almost) always select by multiple columns, and depending on the selectivity of the columns, you might get significant gains from including another column in the clus index. Don't follow too simplistic rules of "(always) one column per clus index" at the expense of the genuine performance of your dbs.


Adding another column to a unique index will allow duplicates to the columns that are supposed to be unique. That's not a performance issue, that's a data integrity issue.

If defining a set of columns as unique, that index should have just the unique columns defined in it, otherwise the unique constraint is not doing the job of enforcing uniqueness across a set of columns.

If you have a business design that says that email address column must be unique, then defining the unique index on email address, surname, status and last order date may be great for performance, but so much for the unique email address.

Look at some of MS's system tables: some of their internal replication tables have 5 clus keys.


Yeah, and many of them have horrific performance.



If you don't look up by the unique column, but by a non-unique column first and a unique column only sometimes, don't make the clus index key only the unique column. That will be a severe performance issue.

Again, rigid overly-simplistic rules such as those will give you a poorly performing database with excessive nonclustered indexes to cover all the queries the clus index can't because its keys were so poorly chosen.

MS has engineers with vastly more expertise at scale than any of us.

Sometimes multiple clus keys are needed on a table.

And since the clus index is the most important part of good query performance, you should consider it carefully and select the proper key column(s) to match your environment, not some pre-determined, overly-simplistic rule.


SQL DBA,SQL Server MVP('07, '08, '09)
"In America, every man is innocent until proven broke!" Brant Parker
Post #1426510
Posted Monday, March 4, 2013 4:34 PM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Today @ 5:35 PM
Points: 23,300, Visits: 32,052
ScottPletcher (3/4/2013)
GilaMonster (3/4/2013)
ScottPletcher (3/4/2013)
GilaMonster (3/4/2013)
dembones79 (3/4/2013)
Should the region and district columns be included in the unique clustered index of the view?


Bear in mind that a clustered index implicitly includes all the columns in the table/view. For the key I'd specify the minimum of columns, just what is the unique portion. The rest of the columns are in the index too.
If it were a nonclustered index I'd still put just the unqiue columns in the key, the others would be include columns.



If you (almost) always select by multiple columns, and depending on the selectivity of the columns, you might get significant gains from including another column in the clus index. Don't follow too simplistic rules of "(always) one column per clus index" at the expense of the genuine performance of your dbs.


Adding another column to a unique index will allow duplicates to the columns that are supposed to be unique. That's not a performance issue, that's a data integrity issue.

If defining a set of columns as unique, that index should have just the unique columns defined in it, otherwise the unique constraint is not doing the job of enforcing uniqueness across a set of columns.

If you have a business design that says that email address column must be unique, then defining the unique index on email address, surname, status and last order date may be great for performance, but so much for the unique email address.

Look at some of MS's system tables: some of their internal replication tables have 5 clus keys.


Yeah, and many of them have horrific performance.



If you don't look up by the unique column, but by a non-unique column first and a unique column only sometimes, don't make the clus index key only the unique column. That will be a severe performance issue.

Again, rigid overly-simplistic rules such as those will give you a poorly performing database with excessive nonclustered indexes to cover all the queries the clus index can't because its keys were so poorly chosen.

MS has engineers with vastly more expertise at scale than any of us.

Sometimes multiple clus keys are needed on a table.

And since the clus index is the most important part of good query performance, you should consider it carefully and select the proper key column(s) to match your environment, not some pre-determined, overly-simplistic rule.


I don't see Gail setting forth any overly-simplistic, predetermined rules for creating indexes either clustered or nonclustered. All I see is a basic recommendation about a unique index (clustered or nonclustered).




Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1426514
Posted Monday, March 4, 2013 4:57 PM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 10:05 AM
Points: 42,829, Visits: 35,961
ScottPletcher (3/4/2013)
Again, rigid overly-simplistic rules such as those will give you a poorly performing database with excessive nonclustered indexes to cover all the queries the clus index can't because its keys were so poorly chosen.


So 'keep the keys for a unique index to the columns that need to be enforced unique' is a 'rigid overly-simplistic rule'?

Maybe you want to read what I said again, that for a unique index (one that's designed for enforce uniqueness) you should keep the keys to the columns that need to be unique, otherwise you have a constraint that isn't constraining anything in a useful way.

Oh, and since I've never once seen a system where all queries against a table query on a single column or set of columns, no matter how you select the clustered index you are going to need nonclustered indexes (not that I said anything about how to select the clustered index keys in this thread). Unless of course you have that idealistic and perfect system where each and every table is only ever filtered or joined on a single column or set of columns. Never seen one of those in reality though.

MS has engineers with vastly more expertise at scale than any of us.


True, the CAT team (Mark, Lubor, Cindy and the rest are awesome). Pity those skilled engineers didn't work on the replication tables. Or on Sharepoint. Or on several other examples of terrible database design that came out of MS.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1426525
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse