|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Yesterday @ 5:49 PM
Points: 37,671,
Visits: 29,925
|
|
ScottPletcher (7/6/2012)
An integer column defined as identity (a very good and popular choice for clustered index) is 4 bytes. It's certainly popular, but most often not good. The no-thought identity as clustered key likely causes more performance problems than any other single thing.
We'll have to agree to disagree on that matter. I hold to the school of thought that the clustered index should first organise the data (narrow, unchanging, ever-increasing, unique). If it can also support queries that's a bonus.
There's only one clustered index, multiple nonclustered indexes, clustered index to organise the table, nonclusters to support the queries.
Gail Shaw Microsoft Certified Master: SQL Server 2008, MVP SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
We walk in the dark places no others will enter We stand on the bridge and no one may pass
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 2:37 PM
Points: 1,319,
Visits: 1,766
|
|
GilaMonster (7/6/2012) There's only one clustered index, multiple nonclustered indexes, clustered index to organise the table, nonclusters to support the queries.
Every column added to a query invalidates the existing covering index and will revert back to a table scan.
How do you keep up with every single-column change to thousands of queries for thousands of databases and tens of thousands, or even millions, of tables, to keep all your non-clus indexes as truly covering?
Just not possible, practical or worthwhile.
The most important performance aspect of a table is determining the correct clustered index.
SQL DBA,SQL Server MVP('07, '08, '09) One man with courage makes a majority. Andrew Jackson
|
|
|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Yesterday @ 5:49 PM
Points: 37,671,
Visits: 29,925
|
|
ScottPletcher (7/6/2012)
GilaMonster (7/6/2012)Every column added to a query invalidates the existing covering index and will revert back to a table scan. Not necessarily a table scan. Maybe a seek + lookup, maybe index intersection, maybe an index scan. SQL has lots of options for indexes. How do you keep up with every single-column change to thousands of queries for thousands of databases and tens of thousands, or even millions, of tables, to keep all your non-clus indexes as truly covering?
I don't. I make covering indexes for critical queries, ensure that not-so-critical queries perform within requirements and make sure that new developments are correctly performance tested and that the developers have made any necessary index creations or modifications before the change are deployed to production.
How do you keep up with changes to all the queries across thousands of databases and tens of thousands, or even millions tables to keep all your nonclustered indexes up to date and useful for the queries that don't use the clustered index, seeing as I have very seldom seen tables with one and only one access path? (and the cluster can only support one access path)
The most important performance aspect of a table is determining the correct clustered index.
Absolutely, and what I want is one that is narrow - doesn't make nonclustered indexes larger than they need to be -, unique - doesn't require a uniquifier column - , unchanging - that isn't going to require that every nonclustered index on the table be updated when the clustering key value is changed - and ever-increasing - that isn't going to cause huge fragmentation due to inserts.
Gail Shaw Microsoft Certified Master: SQL Server 2008, MVP SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
We walk in the dark places no others will enter We stand on the bridge and no one may pass
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 2:37 PM
Points: 1,319,
Visits: 1,766
|
|
The most important performance aspect of a table is determining the correct clustered index.
Absolutely, and what I want is one that is narrow - doesn't make nonclustered indexes larger than they need to be -, unique - doesn't require a uniquifier column - , unchanging - that isn't going to require that every nonclustered index on the table be updated when the clustering key value is changed - and ever-increasing - that isn't going to cause huge fragmentation due to inserts.
The same rote quote, and, yes, those are good general rules, but not enough by themselves.
Please, those of you learning about clustered indexes, look beyond the no-thought approach.
Instead, consider the actual requirements of your specific table before determining the clustered index key(s) for that table.
You'll save yourself huge performance problems and the extra work required to re-index the table properly later.
SQL DBA,SQL Server MVP('07, '08, '09) One man with courage makes a majority. Andrew Jackson
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 2:05 PM
Points: 2,013,
Visits: 1,566
|
|
As a matter of fact, you can’t avoid the scans (table / index) in real life or practical scenarios but you can minimize those and that’s what Gail is trying to explain here.
Every columns that you add to query as predicate may or may not be covering at the moment. It needs not to be covering especially in ad-hoc query environments it’s very common. If a query that executes rarely, I don’t see any reason to worry much on it unless an explicit weight / priority is assign to that query.
Also, you would not like to include all the columns in non-clustered index just to make it covering. Would you?
~Dev~
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 2:37 PM
Points: 1,319,
Visits: 1,766
|
|
Dev (7/8/2012) As a matter of fact, you can’t avoid the scans (table / index) in real life or practical scenarios but you can minimize those and that’s what Gail is trying to explain here.
Every columns that you add to query as predicate may or may not be covering at the moment. It needs not to be covering especially in ad-hoc query environments it’s very common. If a query that executes rarely, I don’t see any reason to worry much on it unless an explicit weight / priority is assign to that query.
Also, you would not like to include all the columns in non-clustered index just to make it covering. Would you?
Not at all. That's why I clearly stated, the (vast) majority of the queries should be using the clustered index to limit rows.
What Gail stated is:
clustered index to organise the table, nonclusters to support the queries.
I make covering indexes for critical queries
If nonclus are to support the queries, they need to be: 1) covering indexes OR 2) select only a very small % of the total rows
Since you can't directly control #2, aren't you then forced to make covering indexes?
If the query restricts to a minimum range of rows based on the clustered index, I don't have to make any adjustments then even if they select/compare every column in the table.
SQL DBA,SQL Server MVP('07, '08, '09) One man with courage makes a majority. Andrew Jackson
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 3:43 AM
Points: 1,257,
Visits: 4,255
|
|
| What does that have to do with the clustered index key, though? You certainly can't cover every possible query with the clustered index (unless you have an incredibly simple database), so you have to have nonclustered ones--and if you have them, then having the clustering key as small as possible (since it needs to be repeated in every NC index) is surely a good idea, which is pretty much what Gail said?
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 2:05 PM
Points: 2,013,
Visits: 1,566
|
|
Let me put this way; I have a table XYZ and an ID (Primary, Clustered Index). I typically don’t query on ID column but it’s used in many queries for joins. I typically search based on Description (Non-Clustered index).
Now in this case, assume I don’t have Non-Clustered index in place and I query based on Description column, it will scan the cluster. Will it be a good plan? If I have Non-Clustered index (on description) in place, based on selectivity optimizer may pick the index. Isn’t it a good plan?
In continuation to the same, if the query selects few non-key columns as well, I may include couple of columns (mostly queried) in it. If this index can satisfy 60-80% queries on that table I will consider it a good index.
The major fact, we can have only one Clustered Index on a table (ignoring workaround with INCLUDE). ‘select only a very small % of the total rows’ is applicable to Clustered Index as well for seek operation else it will scan the cluster.
~Dev~
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Yesterday @ 2:37 PM
Points: 1,319,
Visits: 1,766
|
|
Dev (7/9/2012)‘select only a very small % of the total rows’ is applicable to Clustered Index as well for seek operation else it will scan the cluster.
False, of course. If, for example, you have a table with two years' worth of data, 2010 and 2011, clustered on date, and you specify >= '20110101', SQL will read only the 2011 data.
That is another main reason you should leverage the clus index as much as possible.
SQL DBA,SQL Server MVP('07, '08, '09) One man with courage makes a majority. Andrew Jackson
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 2:05 PM
Points: 2,013,
Visits: 1,566
|
|
ScottPletcher (7/9/2012)
Dev (7/9/2012)‘select only a very small % of the total rows’ is applicable to Clustered Index as well for seek operation else it will scan the cluster.
False, of course. If, for example, you have a table with two years' worth of data, 2010 and 2011, clustered on date, and you specify >= '20110101', SQL will read only the 2011 data. That is another main reason you should leverage the clus index as much as possible.
It’s just an optimization added to SCAN operation but it won’t be good anyways if the data is skewed and ‘>= 20110101’ returns 90% of rows.
~Dev~
|
|
|
|