I recently upgraded 20 SQL servers from 2012 Standard Edition to 2016 Standard Edition. The database sizes ranged from a few GB to 4 TB. The number of rows in the largest tables were > 1 billion. The performance of the largest databases degraded after the upgrade. After extensive troubleshooting, involving the Microsoft Black Belt Team, I came up with a set of recommended server and database settings for SQL Server 2016 large databases. I provide the script MSSQL_Health_Check.SQL that checks these settings. I hope the script might save other people days or weeks of troubleshooting.
One of the key performance issues when upgrading from SQL Server 2012 to higher versions is a new database setting: AUTO_UPDATE_STATISTICS. I observed that auto update stats use a very low sampling rate (< 1%) with very large tables (> 1 billion rows). When the sample rate is very low, the estimated cardinality may not represent the cardinality of the entire table, and query plans become inefficient.
For large databases, do not use auto update statistics. Update statistics via maintenance jobs instead. I recommend the following maintenance jobs:
- Update statistics on work days with at least 10% sampling (100% for filtered indexes)
- Update statistics every Saturday with fullscan
Obviously, scheduling may vary depending on your database size and load patterns.
I provide the stored procedure UpdateStatistics in the script UpdateStatistics.SQL. To use the stored procedure, create a maintenance plan with a SQL task and enter:
The UpdateStatistics stored procedure will update statistics for all databases like this:
- Sets the database setting AUTO_UPDATE_STATISTICS to OFF
- Updates statistics with sampling rate 100% for filtered indexes. If you don't specify a sampling rate of 100% for stats on filtered indexes, they are updated automatically by SQL Server 2016
- Updates statistics for other indexes with sampling rate 10%. You can change the sample rate by setting parameter @sample_percent_for_update
You can limit the update to a single database by setting the parameter @table_filter.
I recommend this maintenance plan:
- Rebuild indexes with fragmentation > 30% followed by reorganize indexes with fragmentation > 15% every Saturday
Server and Database Settings
This section presents the server and database settings that I recommend for SQL Server 2016 large databases based on my observations. You can use the script MSSQL_Health_Check.SQL to check that your database is in accordance with these best practices. I see the list as helpful, but not exhaustive. The Microsoft Tiger Team provides more detailed health check scripts, for example this one: https://raw.githubusercontent.com/Microsoft/tigertoolbox/master/BPCheck/Check_BP_Servers.sql
Lock Pages in Memory
SQL Server service account must not have the ‘lock pages in memory’ privilege on servers that host other mission-critical applications. The locked pages in memory pros:
- SQL Server working set (BPOOL) cannot be paged by windows even when there is system wide memory pressure.
Locked pages in memory cons:
- Operating system will starve for memory when there is system wide memory pressure.
- OS has to completely rely on SQL Server to respond to low memory notification and scale down its memory usage.
- SQL Server may not respond fast enough to low memory condition at system level because OS is already starving for memory.
- LPIM prevents only the BPOOL from paging, Non-Bpool components can still be paged and we have critical portions of SQL Server memory like thread stack, SQL Server Images/DLL’s in Non-Bpool which can still be paged by OS.
Maximum Server Memory
The maximum memory allocated for SQL server must leave sufficient memory for the host OS. Calculate the max. memory for SQL Server like this. Take the total memory available and subtract these amounts::
- 1 GB for OS
- 4 GB for OS the first 16 GB memory
- 1 GB for OS for every additional 8 GB memory
This would mean a 32GB server would leave 7GB (1 + 4 (16GB) + 1 (8GB to 24GB) + 1 (8GB to 32GB)) for the OS and use the rest of SQL Server. Calculate your server memory based on the total physical memory.
Set Traceflags 1117, 1118, 2371, 4199
These flags are actually ignored in SQL Server 2016 or higher:
- TF1117 - When growing a data file grow all files at the same time so they remain the same size, reducing allocation contention points.
- TF1118 - When doing allocations for user tables always allocate full extents. Reducing contention of mixed extent allocations
TF2371 - Alters the behavior of auto update statistics so that it triggers at a lower percentage threshold. This flag must be set if auto update stats is enabled (which it should not be for large databases).
TF4199 - Enables many other query optimizer fixes. See the full list here: https://support.microsoft.com/en-us/help/974006/sql-server-query-optimizer-hotfix-trace-flag-4199-servicing-model
Optimize for Ad Hoc Workloads
The server instance option, 'optimize for ad hoc workloads', must be set to 1. This is used to improve the efficiency of the plan cache for workloads that contain many single use ad hoc batches. When this option is set to 1, the Database Engine stores a small compiled plan stub in the plan cache when a batch is compiled for the first time, instead of the full compiled plan.
This helps to relieve memory pressure by not allowing the plan cache to become filled with compiled plans that are not reused.
The database collation must be equal to the server collation
This is to avoid conversion overhead.
The database compatibility level must be the highest supported by the server product version
Get the latest and greatest version to get align with Microsoft support and external consultants.
Check that the database has been backed up within the last 7 days
Has nothing to do with performance, but you really want to know if backups have not been performed. Especially if you are making changes as part of your troubleshooting!
Log file must auto grow by fixed size
When using the default setting 10% auto growth for the log file, the auto-growth amount would get bigger as the database got larger. I therefore recommend a numerical growth rate so that the log file disk does not run full. I use 1 GB.
The tempdb database should span more than 1 file if there are more than one CPU cores, but no more files than the number of CPU cores. The tempdb database should not be placed on the same drive as the system databases (master, model, msdb).
Your database volumes must have at least 25% free space. The database files should not be placed on OS disk.
Database option AUTO_CLOSE must be OFF.
When AUTO_CLOSE is set ON, it causes performance degradation on heavily used databases by increasing overhead of opening and closing the database after each connection. Additionally, when it is ON, it also flushes (removes) the procedure cache after each connection. There are very rare scenarios when you would need this particular setting on, otherwise in most of the cases, it is a good idea to leave it OFF.
Database option AUTO_CREATE_STATISTICS must be ON
The settings AUTO_CREATE_STATISTICS must be ON. The statistics are automatically created for each new index. The statistics are automatically created for non-indexed columns that are used in your queries.
Not to be confused with AUTO_UPDATE_STATISTICS, which must be OFF.
Database option AUTO_SHRINK must be OFF
Auto shrinking the database comes with a performance hit. If you know that the space that you are reclaiming will not be needed in the future, you can reclaim the space by manually shrinking the database.
Statistics Sample Rate Should Not be Lower than 9.9
Assuming that you have configured maintenance plans to update stats with 10% sampling. A statistics sample rate below 9.9 could indicate that your maintenance plans are not configured correctly.
Statistics Must Be Up to Date
Assuming that you take my advice to run maintenance plans every day, statistics should not be older than a day (2 days on Mondays, according to my recommended schedule).
Indexes and heaps must not have > 30% fragmentation
Again, assuming that you take me up on the advice to rebuild and reorganize indexes every week.