Customizing Statistics Histogram in SQL Server 2019

The use of statistics in SQL Server is tightly embedded in the query optimizer and query processor. The creation and maintenance of statistics is usually handled by the SQL Server engine, though many DBAs and developers know that periodically we might need to update those statistics to ensure good performance of queries. SQL Server 2019 gives us new options.

The historical organization of statistics for a table is a 200-step histogram of values sampled from the data. This can be a sample of the entire dataset or a subset: for tables smaller than 8MB, the entire table is sampled; above this size, the sampling rate drops to reduce the resources required.

This means that sometimes we have less accuracy in the histogram than we would like.

A New DMF

In SQL Server 2019, we have a new DMF, sys.dm_exec_table_stats, that is designed to create a new statistics entry for your table. The parameters for this DMF are:

  • object_id – required. If you have just the table name, you can use object_id() to look it up, but you need to pass the id of the table.
  • schema_id – also required. The schema_id of the schema for this table.
  • column_id – required. The column on which you are creating statistics.
  • histogram_steps – not required, but defaults to 200, which defeats the purpose of this DMF. You can specify any value up to 1024.
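Putting the parameters together, a call might look like the following sketch. The table, column id, and variable names here are assumptions for illustration, not taken from the post:

```sql
-- Hypothetical usage: look up the ids the DMF expects, then request a
-- 1024-step histogram for a single column.
DECLARE @obj int    = OBJECT_ID('Sales.SalesOrderDetail');
DECLARE @schema int = SCHEMA_ID('Sales');

SELECT *
FROM sys.dm_exec_table_stats(
        @obj,      -- object_id (required)
        @schema,   -- schema_id (required)
        4,         -- column_id (required), e.g. ProductID
        1024       -- histogram_steps (optional, defaults to 200)
     );
```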

This means that you can create new statistics that include a more granular detail. You can see this in action with the DBCC SHOW_STATISTICS command and the WITH HISTOGRAM option. I ran this DMF with a value of 1024 on a large version of AdventureWorks and got these results. I am only showing the bottom of the results here.
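To inspect the resulting histogram, DBCC SHOW_STATISTICS can be limited to just the histogram rows. The statistics object name below is an assumption for illustration:

```sql
-- Show only the histogram steps for an existing statistics object.
DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail', IX_SalesOrderDetail_ProductID)
WITH HISTOGRAM;
```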

[Screenshot: DBCC SHOW_STATISTICS WITH HISTOGRAM results against AdventureWorks, shown in SQL Prompt]

I didn’t get the full 1024 steps, but I got close. As you can see, the histogram is significantly larger than the old 200-step limit.

Using Larger Statistics Histograms in your Database

There is a great tutorial on how this works from the SQL Server Tiger Team. I’d encourage you to read it, experiment with your own data set, and see which values are most useful. For larger tables, the Tiger Team seems to recommend 500 steps, so 1024 might be overkill for most of us.

Also, this is an April Fool’s joke, which you might have realized if you clicked on some of the links above. Hope you enjoyed this.

