SQLServerCentral Article

SSC Clinic: Can Implementing "Optimize for Ad Hoc Queries" Boost Performance for the SQLServerCentral.com and Simple-Talk.Com SQL Servers?

,

Note: This is a continuation of my SSC Clinic series where I take a close look at the SQLServerCentral.com and Simple-Talk.com servers and see if I can boost their performance. All code examples have been tested in SQL Server 2008 R2 and higher.

With the introduction of the instance-level option “optimize for ad hoc workloads” in SQL Server 2008, DBAs have a tool to deal with a problem known as plan cache pollution, or plan cache bloat. It’s often caused when one-time use ad hoc queries are sent to SQL Server from Object-Relational Mapping (ORM) solutions, such as LINQ, NHibernate, or Entity Framework. The problem can prevent SQL Server from using its available memory optimally, potentially hurting performance.

How Does the Query Optimizer Deal With Ad Hoc Queries?

The best way to understand how ad hoc queries can cause plan cache bloat and hurt performance is with a brief explanation of how the Query Optimizer optimizes ad hoc queries and stores them in the plan cache using default instance-level options.

Generally speaking, when an ad hoc query is first executed by SQL Server, a query plan is created and stored in the plan cache. The next time the same ad hoc query is run, the Query Optimizer can use the existing execution plan stored in the plan cache instead of taking the time and resources to create a new execution plan. This way, SQL Server does not have to continuously create a new execution plan each time the ad hoc query needs to run, helping to boost SQL Server performance. Of course, execution plans take up memory in the plan cache, but as long as the execution plans are being reused, this is memory well used.

Unfortunately, you can’t assume that an ad hoc execution plan will be used more than once. Depending on the application creating them, it is quite possible that a great many ad hoc queries will never be reused. Is this a bad thing? It can be. If lots of ad hoc queries have execution plans stored in the plan cache, and are never reused, then a lot of plan cache memory is wasted storing one-time queries that could be better put to use by SQL Server’s data cache. This is the classic definition of plan cache bloat.

Here’s what is happening inside SQL Server. SQL Server has a dedicated amount of RAM devoted to the buffer pool. The buffer pool includes the data cache, the plan cache, and other smaller caches not related to this topic. Because the buffer pool is a limited resource, when the plan cache needs more RAM, it usually has to take it away from the buffer pool, which in most cases will result in a smaller data cache.

The data cache is where SQL Server stores the data pages it needs to perform work. When SQL Server needs to perform some sort of SELECT, INSERT, UPDATE, or DELETE, the pages with the affected rows need to be in the data cache. If they are in the data cache, the data is in RAM, so SQL Server can perform its task very quickly. If the pages are not in the data cache, SQL Server has to read the pages from disk into the data cache before performing its task. Disk IO is much slower than RAM, so moving the data can slow down performance.

In other words, a large data cache stores more data pages and reduces the need for data to be moved from disk to the data cache, which boosts overall SQL Server performance. If the data cache is smaller, then it is very possible that more disk IO will have to be performed, hurting performance. Generally speaking, the larger the data cache, the faster SQL Server runs.

If the plan cache needs RAM, it may take it away from the data cache, making it smaller, and potentially hurting performance. On the other hand, the plan cache is very useful because it prevents the need to recreate execution plans every time the same ad hoc query needs to run again. So there needs to be a balance between how much RAM from the buffer pool is allocated to the data and plan caches.

In a perfect world, SQL Server would automatically balance the needs of the data and plan caches so that performance is always optimal, but unfortunately, this is often not the case. There are several reasons for this. First, the plan cache has no maximum upper limit. Of course, SQL Server won’t let the plan cache take all the RAM away from the data cache. Based on an internal formula, if the plan cache size gets to about 75-80% (see this article for the exact formula) of SQL Server’s target memory (the total amount of dynamic memory SQL Server can consume), then SQL Server will reduce the number of execution plans stored in the plan cache. Essentially, it removes the least used plans. This frees up memory that then can be used by the data cache, if needed. On the other hand, if a server has a lot of RAM, say 64GB or higher, then the plan cache can become very big, and potentially suffer from plan cache bloat.

Now, you may be asking, if SQL Server has a formula for preventing the plan cache from getting too big and putting pressure on the data cache, what’s the big problem? The problem is that the formula is not very sophisticated, and it can leave one-time use ad hoc queries in the plan cache for a long time, causing plan cache bloat.

What Happens When “Optimize for Ad Hoc Workloads” is Turned On?

Now that you have a basic understanding of how one-time ad hoc queries can cause plan cache bloat, we can talk about a potential solution: turning on the instance-level option “optimize for ad hoc workloads”.

This option changes the plan caching behavior of ad hoc queries. Here’s what happens when this option is turned on. When an ad hoc query is submitted to the query optimizer for the first time, an execution plan is created and the query is executed. Normally, this also means that the full execution plan is cached in the plan cache. But with the “optimize for ad hoc workloads” option turned on, the entire execution plan is not stored in the plan cache when an ad hoc query is first executed. Instead, a compiled plan stub is stored. Essentially, this stub has just enough data for the Query Optimizer to recognize if the ad hoc query runs a second time, at which point the execution plan will be recreated and stored in the plan cache. They theory is if an ad hoc query has been used at least twice, that most likely it will be used again, so the execution plan now stays in the plan cache.

The net result is that if an ad hoc query is just run once, then only a small stub is stored in the data cache, using very little memory, instead of a full execution plan, which can potentially take a lot of memory. If your instance experiences a lot of these one-time ad hoc queries and is experiencing plan cache bloat, turning “optimize for ad hoc workloads” essentially makes the plan cache bloat problem go away. This frees up RAM that can now be allocated to the data cache, which generally results in less disk IO and better overall SQL Server performance.

How Do You Know If You Have Plan Cache Bloat?

Unfortunately, there is no single easy step you can take to determine if your server has plan cache bloat. It depends on the applications running against the databases on a particular SQL Server instance, and specifically, whether the applications create a lot of one-time use ad hoc queries.

Determining if your server is experiencing plan cache bloat takes several steps. One of the first steps is to run the following query, which tells you how many one-time use ad hoc queries are currently stored in your instance’s plan cache.

--Gives you the total number of one-time use ad hoc queries currently in the
--plan cache.
USE master
GO
SELECT  SUM(CASE WHEN usecounts = 1 THEN 1
                 ELSE 0
            END) AS [Adhoc Plans Use Count of 1]
FROM    sys.dm_exec_cached_plans
WHERE   objtype = 'Adhoc'
GROUP BY objtype;
GO

If you prefer more detail, the following query will list, row-by-row, all the one-time use ad hoc queries currently in the plan cache. The number given by the above query and the number of rows from the following query should be almost the same. They will never be exactly the same because counts of one-time use ad hoc queries are constantly changing.

--List all the one-time ad hoc queries, row by row, currently in the plan 
--cache.
USE master
GO
SELECT  usecounts ,
        size_in_bytes ,
        cacheobjtype ,
        objtype
FROM    sys.dm_exec_cached_plans
WHERE   objtype = 'Adhoc'
        AND usecounts = 1;
GO

While the above two queries tell you how many one-time use ad hoc queries are in the plan cache, the following query will tell you how much memory is being used by them. This is often more useful than knowing the exact number of them.

--Tells you how much memory is being used by the one-time use ad hoc queries
USE master
GO
SELECT  SUM(CAST(( CASE WHEN usecounts = 1 THEN size_in_bytes
                        ELSE 0
                   END ) AS DECIMAL(18, 2))) / 1024 / 1024 AS [Total MBs Used by Adhoc Plans With Use Count of 1]
FROM    sys.dm_exec_cached_plans
WHERE   objtype = 'Adhoc'
GROUP BY objtype;
GO

Note: Because the data returned by these queries changes rapidly and constantly, it is best to run them regularly over time and store the results to create a data baseline. Later in this article, I will do this using SQL Monitor 3.1, which not only allows you to create custom metrics, like you see above, but also allows you to store and graphically display the baseline data with ease.

Before I go any further, it is very important that you thoroughly understand the results you see from the above queries. This is rarely explained in articles I have read on this topic. The crucial point is that while the above queries do provide you with a count of one-time use ad hoc queries, or their total memory use, the results are only for the fraction of a second when the queries ran.

In practice, this means that although a query may be identified as a one-time use ad hoc query at a specific point, it may not continue to be a one-time use ad hoc query. After you have counted one-time use ad hoc queries, one, some, even most of them may be run again. In addition, if the plan cache is coming under memory pressure, and SQL Server begins to remove one-time use ad hoc queries on its own, the results will be different the next time you run the query.

Because the number of one-time use ad hoc queries, and the memory they occupy, is changing all the time, counting them at a particular point in time is not a great indication of how many one-time use ad hoc queries there really are in the plan cache. On the other hand, it does give an indication. Broadly speaking, if the number of rows returned and memory used is high, this is an indication that your instance may be experiencing plan cache bloat.

For example, an instance with 25,000 one-time use ad hoc queries that use 690 MB of RAM may have plan cache bloat. If your instance only has 3,400 one-time use ad hoc queries and they use only 48 MB of RAM, your instance may be fine.

While high numbers from the above queries are a fairly good indicator, the only way to really know if your server has plan cache bloat, and whether turning on “optimize for ad hoc workloads” can fix the problem, is to test the option yourself.

To show you how you might go about this, I will spend the rest of this article testing out the “optimize for ad hoc workloads” option on the SQLServerCentral.com and Simple-Talk.com servers. Depending on the results – specifically, by taking baseline readings before and after the change –I can determine if changing the setting is a good idea.

Testing “Optimize for Ad Hoc Workloads” in the Real World

SQLServerCentral.com and Simple-Talk.com run on a two-node active/active cluster. Each runs on its own instance. Both servers run multiple applications and databases. Some of the applications were written in-house and others are third-party applications. To learn more about these servers, visit monitor.red-gate.com, where you can watch their live activity using SQL Monitor 3.1.

I am not the full-time DBA for these servers, so my first step was to find out if the “optimize for ad hoc workloads” setting was already turned for each instance. To do so, I ran the following query:

USE master
GO 
SELECT  CASE value_in_use
          WHEN 0 THEN 'Optimize for Ad Hoc Workloads is Turned Off'
          WHEN 1 THEN 'Optimize for Ad Hoc Workloads is Turned On'
        END AS [Optimize for Ad Hoc Workloads Current Status]
FROM    sys.configurations
WHERE   name = 'optimize for ad hoc workloads';
GO

Both SQLServerCentral.com and Simple-Talk.com had “optimize for ad hoc workloads” turned off. This is the default setting for new SQL Server installs. Once I knew both servers had “optimize for ad hoc workloads” turned off, I was ready to begin my testing.

In order to do a proper test, I needed baselines from before and after the change. There is no perfect way to perform such a test on a production server, but I performed a 4 hour baseline (8:00 AM to 12 noon) on one day, and took the same 4 hour baseline on the following day during the same time. I wanted to leave plenty of time between the baselines because it takes time for the various ad hoc queries that run on each instance to occur.

The test was not completely controlled, as the load on the servers change regularly—Tuesdays typically get more page views from the websites than Mondays, for example. Nonetheless, this test is valid for determining if using “optimize for ad hoc workloads” is a good choice. I took two baseline metrics (discussed a little later), and used SQL Monitor 3.1 to gather the data. I chose SQL Monitor because it allows me to easily create custom metrics to gather baseline data and graph it out over time.

After capturing the initial benchmark data, I changed the “optimize for ad hoc workloads” setting for each server using this code:

--Turn on the optimize for ad hoc workloads setting

--Turn on the optimize for ad hoc workloads setting
sp_CONFIGURE 'show advanced options', 1 
GO
RECONFIGURE;                             
GO 
sp_configure 'optimize for ad hoc workloads', 1 
GO
RECONFIGURE;
GO

Only new queries are affected by a change to the “optimize for ad hoc workloads” setting. To speed my testing along, I ran DBCC FREEPROCCACHE after I turned “optimize for ad hoc workloads” on, which cleared out the plan cache.

Some of you are thinking, doesn’t Brad know better than to run DBCC FREEPROCACHE in production? You’re right, running DBCC FREEPROCCACHE on production isn’t wise. It clears out the plan cache, so all queries have to have new execution plans created for them, and the SQL Server instance takes a performance hit. But, I did it under controlled conditions, and the only noticeable performance hit lasted about a minute when CPU and some other activity spiked. In addition, neither instance has any significant performance issues, and both have plenty of hardware overhead, so I went ahead and broke a rule of thumb. But I don’t recommend you try this at home!

Next I had to wait 24 hours. I assumed this would give virtually all the queries that normally run on these servers time to run after I cleared the cache. I then took another 4 hour baseline of a single metric. The results are below.

Test Results

If turning “optimize for ad hoc workloads” on is a benefit to the SQL Server instance, we should expect the number of one-time use ad hoc queries that are turned into stubs change from 0 to a high number. If the number is, say, 500 or more, then we can safely assume that the instance had plan cache bloat and that turning on “optimize for ad hoc workloads” was a good choice. If it’s less than 100, turning on “optimize for ad hoc workloads” probably didn’t help. 100 and 500 are rough estimates, of course. You will have to determine for yourself what number of stubs is important. In many cases, the results you get back are obvious, as they were in my testing.

Number of Ad Hoc Queries Running Only 1 Time

To determine if turning on “optimize for ad hoc workloads” is a good thing, I used three separate metrics, two for the “before” baseline, and one for the “after” baseline.

The first metric is one we have already seen. It counts the total number of one-time use ad hoc queries. I used the following code to create a custom metric in SQL Monitor so that the query was run every 60 seconds and the data stored so I could view it graphically for my benchmarking.

--Gives you the total number of one-time use ad hoc queries currently in the --plan cache.
USE master
GO
SELECT  SUM(CASE WHEN usecounts = 1 THEN 1
                 ELSE 0
            END) AS [Adhoc Plans Use Count of 1]
FROM    sys.dm_exec_cached_plans
WHERE   objtype = 'Adhoc'
GROUP BY objtype;
GO

This metric is best used before making the change, as it gives you a feel for how many one-time use ad hoc queries there are. While you can capture the same metric after the change, which I did, the data returned is often not relevant, because the load on the server may not be the same as it was during the initial benchmark. This was indeed the case when I viewed the “after” benchmark for this metric, so I have not included the “after” data in this article. There is a much more useful metric for success or failure, which I will discuss shortly.

Amount of Memory Used by Ad Hoc Queries Running Only 1 Time

We have also seen this metric before. It measures the total amount of memory in the plan cache currently occupied by one-time use ad hoc queries. Again, I created a custom metric in SQL Monitor so that the query runs every 60 seconds and stores the data for graphing.

--Tells you how much memory is being used by the one-time use ad hoc queries
USE master
GO
SELECT  SUM(CAST(( CASE WHEN usecounts = 1 THEN size_in_bytes
                        ELSE 0
                   END ) AS DECIMAL(18, 2))) / 1024 / 1024 AS [Total MBs Used by Adhoc Plans With Use Count of 1]
FROM    sys.dm_exec_cached_plans
WHERE   objtype = 'Adhoc'
GROUP BY objtype;
GO

This metric is best used before making the change, as it gives you a feel for how much memory is used by one-time use ad hoc queries. Again, I did capture this metric after the change, but because of the difference in load, the results weren’t relevant and the data is not included here.

Number of Ad Hoc Stubs Created When “optimize for ad hoc workloads” Is Turned On

This query is new. It counts the number of one-time use ad hoc stubs that are created, and is the best indicator of how successful turning on “optimize for ad hoc workloads” is. This benchmark can only be collected after turning on “optimize for ad hoc workloads” (it will return 0 before “optimize for ad hoc workloads” is turned on). As with the two previous metrics, I created a custom metric in SQL Monitor so that the query runs every 60 seconds and stores the data.

--Number of Ad Hoc Queries that are stubs
--If using SQL Server 2008 RTM, use "Compiled Plan Stu" instead of "Compiled --Plan Stub" below. This "bug" was fixed in SP1.
SELECT  SUM(CASE WHEN usecounts = 1 THEN 1
                 ELSE 0
            END) AS [Adhoc Plans Use Count of 1]
FROM    sys.dm_exec_cached_plans
WHERE   objtype = 'Adhoc' and cacheobjtype = 'Compiled Plan Stub'
GROUP BY objtype;
GO

Now that we understand the metrics, it’s time to see how each server “reacted”.

SQLServerCentral.com Analysis

The first server I tested was SQLServerCentral.com. The first two benchmark graphs below, which were created using SQL Monitor, show metrics taken before “optimize for ad hoc workloads” was turned on.

Figure 1: Number of one-time use ad hoc queries over a four-hour period.

As you can see from figure 1 above, the mean number of one-time use ad hoc queries was 3,242.2 over the four-hour benchmark period. The number varied over time, but was more or less the same.

Figure 2: Amount of memory used by one-time use ad hoc queries over a four-hour period.

The mean amount of memory used over the same four-hour time period by one-time use ad hoc queries was 303.7 MB. This seems like a big number, with the potential for plan cache bloat occurring. But is it…?

The next benchmark is the number of ad hoc stubs created after “optimize for ad hoc workloads” was turned on. This is the most important number to watch. If it’s low, turning on “optimize for ad hoc workloads” was not very helpful, maybe even harmful to performance. If it’s high, “optimize for ad hoc workloads” is working as designed and plan bloat will go away. So let’s examine the results.

Figure 3: Number of one-time use ad hoc query stubs after turning on “optimize for ad hoc workloads”.

Are you as surprised as I was by this result? The mean number of ad hoc stubs over the four-hour benchmark was only 21.1, a very low number. In other words, turning on “optimize for ad hoc workloads” for this server didn’t help.

What can explain this? It goes back to what I said earlier, that measuring the number of one-time use ad hoc queries and the memory associated with them gives a one-time snapshot. What the full results tell us is that most of the one-time use ad hoc queries we identified were used again, and that the applications are continuously producing more ad hoc queries on a regular basis. This means that the numbers we saw in the first two graphs weren’t very useful in telling us if the instance had plan cache bloat. As you can see, it’s hard to determine if your server has plan cache bloat until you perform an actual test.

Based on this test, I turned off “optimize for ad hoc workloads” on this server. It doesn’t have plan cache bloat, so leaving the option on may hurt performance (albeit not much), because it has to create execution plans twice for each ad hoc query, which isn’t necessary in most cases.

Simple-Talk.Com Analysis

I tested the Simple-Talk.com server next. The first two benchmark graphs below show metrics taken before “optimize for ad hoc workloads” was turned on, just as I did with the SQLServerCentral.com server.

Figure 4: Number of one-time use ad hoc queries over a four-hour period.

As you can see, the mean number of one-time use ad hoc queries was 14,678.3 over the four-hour benchmark period. This is a much greater number than we saw with the SQLServerCentral.com server. In addition, the graph is almost a straight line, unlike the up and down line we saw with the SQLServerCentral.com server. Obviously, this server has substantially different one-time use ad-hoc query activity than the SQLServerCentral.com server.

Figure 5: Amount of memory used by one-time use ad hoc queries over a four-hour period.

The radical difference between the two servers is also indicated in figure 5, which shows that over the benchmark time, a mean of 1,613 MB of RAM was used for the plan cache, again much higher than for SQLServerCentral.com. When you compare the two sets of benchmarks, what is your guess about plan cache bloat on this server?

Figure 6: Number of one-time use ad hoc query stubs after turning on “optimize for ad hoc workloads”.

If you guessed there was plan cache bloat, you were right. Over the benchmark period, the mean number of one-time use ad hoc queries that are now stubs was 13,387.1, a huge number. Unfortunately, there is no easy way to calculate the exact amount of RAM you will be saving in the plan cache from the use of these stubs, but you can be assured it is a substantial amount, a big saving on the 1,612 MB that was previously used.

Some of you may be wondering why I said it is hard to calculate how much plan cache RAM will be saved. It’s because the server is in production and is constantly changing, making it hard to track down the RAM savings. If the environment was in a steady state before and after the change to “optimize for ad hoc workloads”, then I could check the amount of RAM used by the plan cache before and after the change using a counter, such as SQL Server: Memory Manager: SQL Cache Memory (KB), or by using the sys.dm_os_memory_clerks DMV. But since the production system is changing all the time, it is hard to get an accurate measurement.

Based on this test, I have left “optimize for ad hoc workloads” on for this server.

Summary

As we have seen, the “optimize for ad hoc workloads” instance-level option can have little or no effect on performance, it can even hurt performance, or it can eliminate the plan cache bloat problem, helping to boost overall SQL Server performance. The only way you can really know for sure is to test it for yourself. What really made this testing easy for me was using of SQL Monitor 3.1 to create custom metrics, store the data, and display the results.

If you are not using “optimize for ad hoc workloads” for your instances, consider doing so. Perform a benchmark like I did for the first two metrics, and if the numbers are high, consider turning on “optimize for ad hoc workloads” and testing it. If the numbers are relatively low, as was the case with the SQLServerCentral.com server, you might decide not to waste your time performing tests. In any event, you now have a new tool that you can add to your toolbox of methods to help you optimize the performance of your SQL Server instances.

Rate

4.7 (10)

You rated this post out of 5. Change rating

Share

Share

Rate

4.7 (10)

You rated this post out of 5. Change rating