Printed 2015/01/28 11:17AM

Tuning Tip: Identifying Overlapping Statistics

By Kendal Van Dyke, 2010/09/08

Statistics are used by SQL Server's query optimizer to help determine the most efficient execution plan for a query. When the option to automatically create statistics is enabled (which it is by default) SQL Server will create statistics on columns used in a query's predicate as necessary, which usually means when statistics don't already exist for the column in question.

Statistics are also created on the key columns of an index when the index is created. SQL Server understands the difference between auto created column statistics and index statistics and maintains both - you can see this by querying the sys.stats system view. As I found out firsthand not too long ago having both auto created statistics and index statistics on the same column caused the query optimizer to choose a different - and less than optimal - execution plan than when only the index statistics existed.

According to the MSDN article Statistics Used by the Query Optimizer in Microsoft SQL Server 2000 auto created statistics are automatically dropped over time if they are not used but that can take an undetermined amount of time. What if we're experiencing the kind of problem I previously wrote about? In most of the cases I've seen it makes sense to help SQL Server out by manually dropping the auto created statistics in favor of index statistics that exist for the same column.

Fortunately SQL Server contains everything we need to know to figure out when column statistics are overlapped by index statistics. The following query will identify overlapped\overlapping statistics and generate the statements you can use to drop the overlapped statistics. All the usual warnings apply here - although this has not caused any problems for me your mileage may vary so wield this with an appropriate degree of caution:

WITH    autostats ( object_id, stats_id, name, column_id )
AS ( SELECT   sys.stats.object_id ,
sys.stats.stats_id ,
FROM     sys.stats
INNER JOIN sys.stats_columns ON sys.stats.object_id = sys.stats_columns.object_id
AND sys.stats.stats_id = sys.stats_columns.stats_id
WHERE    sys.stats.auto_created = 1
AND sys.stats_columns.stats_column_id = 1
SELECT  OBJECT_NAME(sys.stats.object_id) AS [Table] ,
   AS [Column] ,
   AS [Overlapped] ,
   AS [Overlapping] ,
'DROP STATISTICS [' + OBJECT_SCHEMA_NAME(sys.stats.object_id)
'].[' + OBJECT_NAME(sys.stats.object_id) + '].['
+ + ']'
FROM    sys.stats
INNER JOIN sys.stats_columns ON sys.stats.object_id = sys.stats_columns.object_id
AND sys.stats.stats_id = sys.stats_columns.stats_id
INNER JOIN autostats ON sys.stats_columns.object_id = autostats.object_id
AND sys.stats_columns.column_id = autostats.column_id
INNER JOIN sys.columns ON sys.stats.object_id = sys.columns.object_id
AND sys.stats_columns.column_id = sys.columns.column_id
WHERE   sys.stats.auto_created = 0
AND sys.stats_columns.stats_column_id = 1
AND sys.stats_columns.stats_id != autostats.stats_id
AND OBJECTPROPERTY(sys.stats.object_id, 'IsMsShipped') = 0

For more information on statistics see Using Statistics to Improve Query Performance in SQL Server Books Online. I also recommend watching Introduction to SQL Server Statistics, a presentation my friend Andy Warren (Blog | Twitter) recently gave for the PASS Performance Virtual Chapter which is available in their Presentation Archive.

Copyright © 2002-2015 Simple Talk Publishing. All Rights Reserved. Privacy Policy. Terms of Use. Report Abuse.