Statistics are the magic ingredient that helps the query optimizer create its best guess for generating an execution plan. Keeping statistics as accurate as possible will give the optimizer the information it uses to calculate estimated costs and cardinality, which enables it to properly allocate resources such as memory grants. In this blog I am not going to go into what statistics are and how they are used. There are many great resources out there to read, I am however going to tell you about a trace flag that can help you with keeping your large table statistics up to date. I will also explain why your statistics may not be as current as you would like.
This information will apply to those running SQL Server edition 2008 R2 SP1 up to and including SQL Server 2014. For those that are lucky enough to be running 2016 or higher (and in compatibility mode 130 or higher) you are in luck as this behavior is now enabled by default. In discussions lately, I have found that many database administrators are unaware it exists.
If you are using auto update statistics on your database, and this is a best practice, then you are already updating your statistics on a regular basis as your data changes. By that I mean, SQL Server will automatically trigger a process to update your statistics. This occurs based on threshold percentage value of change. The auto-update process runs when statistics are used by a query and when they may be out of date.
According to MSDN, the optimizer determines when statistics are out of date by counting the number of modifications since the last statistic update and comparing the number of row changes, insert, updates, deletes and merges that occur on your table. However, as you can imagine for larger tables to meet that percent change it could mean a significant amount of time, which can lead to subpar estimations. Imagine a table with 5 million rows, it would require 1 million rows to change before an auto update statistics to be triggered. This is especially true for those environments that do not routinely run an UPDATE STATISTICS with FULLSCAN maintenance job.
There is a trace flag available that will assist in keeping stats up-to-date in large tables. Currently, the algorithm that is used to automatically update statistics is 20% + 500 rows. Trace Flag 2371 changes this algorithm to one that utilizes a sliding scale as shown below. Using this trace flag will drastically increase the frequency of which statistics updates occur on your larger tables, which in turn give the optimizer much better estimates to work with.
Figure 1 Graph Source
Along with this setting, it is recommended to turn on the Asynchronous Auto Stats Update setting on databases which have large tables. I have a blog which talks about the difference between Asynchronous and Synchronous statistics. Because the update statistics process will be triggered more frequently, users may experience slowness or even timeouts while waiting for it to complete. By doing asynchronous statistics updates the users query will continue to run as normal. The query will use the old statistics while SQL Server updates the statistics in the background. Once complete, SQL Server will automatically make the new statistical values available to queries.
If you are in an environment that is not using SQL Server 2016 or higher and you have large tables, consider enabling this trace flag. It’s a very easy fix but does require a service restart.