TL;DR; Final query is at the bottom.
Every now and again (particularly when someone is having performance problems) I’ll get the question “When were the statistics last updated?” On top of that, I recently found out that our UPDATE STATISTICS job uses the RESAMPLE option that tells the command to update the statistics based on it’s most recent sample rate. Which of course led me to wonder What was the previous sample rate?
In the end, I did a little research and found this post on the PERSIST_SAMPLE_PERCENT option with a great query that is almost exactly what I needed. So I’m going to use it as a starting point. I recommend reading through the post as it’s quite good and has some great information about the option PERSIST_SAMPLE_PERCENT. Here’s the original query:
SELECT ss.stats_id, ss.name, filter_definition, last_updated, rows, rows_sampled, steps, unfiltered_rows, modification_counter, persisted_sample_percent, (rows_sampled * 100)/rows AS sample_percent FROM sys.stats ss INNER JOIN sys.stats_columns sc ON ss.stats_id = sc.stats_id AND ss.object_id = sc.object_id INNER JOIN sys.all_columns ac ON ac.column_id = sc.column_id AND ac.object_id = sc.object_id CROSS APPLY sys.dm_db_stats_properties(ss.object_id, ss.stats_id) shr WHERE ss.[object_id] = OBJECT_ID('[Sales].[SalesOrderHeaderBulk]') AND ac.name = 'OrderDate';
The first change I want to make is to get rid of the existing WHERE clause. I want to run this on all statistics, not just one. This also means I can get rid of the reference to sys.all_columns and sys.stats_columns. Next, I’m going to make all of the column references two-part names to be sure I’m not missing a reference to one of the tables I removed and because I think it’s a good best practice. Then, since I’m not pulling for a single statistic anymore I’d best pull the table name into the query (by referencing sys.objects), and since I don’t care about system objects I’m going to restrict this to user tables only (using is_ms_shipped). Last but not least I’m going to clean up the column order a little bit (given that my point here is when was it last updated and how many rows have been/will be sampled) and put an order by.
SELECT o.name AS object_name, ss.stats_id, ss.name as stat_name, ss.filter_definition, shr.last_updated, shr.persisted_sample_percent, (shr.rows_sampled * 100)/shr.rows AS sample_percent, shr.rows, shr.rows_sampled, shr.steps, shr.unfiltered_rows, shr.modification_counter FROM sys.stats ss INNER JOIN sys.objects o ON o.object_id = ss.object_id CROSS APPLY sys.dm_db_stats_properties(ss.object_id, ss.stats_id) shr WHERE o.is_ms_shipped = 0 ORDER BY o.name, ss.stats_id;