Those who do know me, will know that I have been presenting a session this year titled “Natural Born Killers, performance issues to avoid” around the UK. The “Series” element of the title is there because I have decided to serialise the session for the benefit of those who have been unable to see it thus far.
The plan is to release one topic from the presentation each day of SQL Relay 2013. Today is the eighth and final day of SQL Relay, this leg is being held in the Microsoft offices in London. Unfortunately I’m not there today as I’m in Madrid for TechEd where I am representing Dell on their stand, but I’ve seen the schedule and it’s going to be a cracking day of SQL Server content.
Parameter sniffing is receiving a lot of bad press at the moment, but it is actually an optimisation process. In this respect you can liken it to parallelism, in some quarters this is seen as a bad thing too. I’m sure you have all come across someone who once had a problem so it was turned off. Repeat after me “Parameter sniffing is not necessarily a bad thing” whilst you appear to be saying random things to your colleagues also say “Parallelism isn’t a bad thing either”.
What is Parameter Sniffing?
For those of you that might not be aware of what parameter sniffing is let me define it for you:
Parameter sniffing is an optimisation process that stores the actual values of parameters with the query plan at compilation time.
So why is this such a big problem? To be honest most of the time it isn’t a problem, it’s only a problem if the data being used in your parameters vary wildly in their cardinality. In other words it all comes down to statistics. Not sure what statistics are? Check out Day 1 of this series.
How do I know when it will affect me?
This one is pretty tricky, you really need to know what you are looking for and this means knowing your environment. What you need to be doing is monitoring the IO and CPU usage of your queries really closely to see if suddenly things slow down and perform lots more IO. If they are really slow only some of the time then there’s a good chance you have a statistics skew in your data causing this problem.
What does this mean? This means you need to start baselining your environment. It so happens that I will be recording a webinar for Dell on baselining in July, so stay tuned for further details on that.
As I mentioned before, this is a problem caused by unequal amounts of data rather than anything bad SQL Server is doing. There is another way you can check to see what is happening by looking at the actual execution plan. This is something you should be testing when creating/deploying stored procedures with parameters.
This is a quick checklist that I recommend that you follow in these scenarios:
- Have a look at the statistics that will be used for your parameter. Are they even?
- Are your statistics being updated manually or automatically? See day 1 for the reason I am asking this.
- What happens to the plan when you cache the most frequently occurring value and then run the polar opposite?
- What happens to the plan when you cache the least frequently occurring value and then run the most frequent?
- Double check what your SLA is for this query.
- Ask your business if they would like a default value for this.
- Usually patching happens in the dead of night or weekends, who will compile the first plans after reboot? Do you need to safeguard against this?
- If required look at fixing the issue with one of the options that are coming up later.
Let’s see the problem in action
Now we know a little bit about the positive and negative sides of parameter sniffing here’s an example. Before we start, I cannot stress enough that you should run the following on a nonproduction server.
First off we’re going to create and populate a table in TempDB with some very skewed statistics. We’ll then create a parameterised stored procedure and run it a few times to see how this skew in statistics affects the plans that the optimizer has chosen for us.
USE TEMPDB; GO --Create a base table to hold our example data CREATE TABLE ParameterTable (ParameterTableID int identity(1,1) primary key clustered ,ParameterTableDate DATETIME DEFAULT GETDATE() ,ParameterTableParam VARCHAR(10) ,SySTableID int ,SysColumnsID int ) --Create some heavily biased data INSERT INTO ParameterTable SELECT TOP 100000 GETDATE(), CASE WHEN sc1.column_id = 10 and sc1.user_type_id = 48 AND sc2.object_id < 100 THEN 'Obscure' ELSE 'Common' END, sc1.object_id, sc1.column_id FROM Sys.tables st cross join sys.columns sc1 cross join sys.columns sc2; GO --Create an index to build statistics, this index is designed not to cover the full query CREATE INDEX Snifter on ParameterTable(ParameterTableParam); go --Build a stored procedure that uses a parameter CREATE PROCEDURE ParamSniffDemo (@ParamValue VARCHAR(10)) AS BEGIN SELECT ParameterTableID, ParameterTableDate, ParameterTableParam FROM ParameterTable PT WHERE ParameterTableParam = @ParamValue; END GO
At this point I want you to turn on Actual execution plans in whatever editor you are using to run this. I’m now going to run a command that I wouldn’t recommend you run unless you are running this on your own laptop/desktop.
DBCC freeproccache; go
The reason I am doing this and am telling you not to is that this will clear all plans out of the plan cache. This is bad if people are actually using it. There are ways to clear this down for just one plan so look this up in Books Online if you are doing this on a shared machine.
Now let’s turn on SET STATISTICS IO for an indication of how many logical reads we are performing and run the stored procedure:
SET STATISTICS IO ON; go EXEC DBO.ParamSniffDemo @ParamValue = 'Obscure'; go
For me this procedure returned 140 rows, your results may vary depending on what’s in TempDB at that time.
As we can see from the image above, this version of the stored procedure performed 299 logical reads.
The plan doesn’t look too bad either:
As you can see it is doing a key lookup which I covered on day 5 which could be avoided, but it’s having this lookup in here which makes this example come alive. To see what I mean by this we now need to run the same stored procedure again, but this time using the “Common” parameter which just happens to be most of the table.
EXEC DBO.ParamSniffDemo @ParamValue = 'Common'; go
When we look at the statistics for this unsurprisingly we have performed a lot more logical reads:
What does the query plan show us I wonder?
The plan looks remarkably similar in shape to the first one, and that’s because it is. It’s running exactly the same plan, problem is we are running this against a much bigger data set which means this isn’t the most optimal way.
How do I know it’s using the same plan? I can check inside the properties of the plan:
In the above image you can clearly see (click on the image to enlarge) that the runtime value was “Common” as we specified, but it was actually using the plan based upon the parameter “Obscure”. To prove just how different the skew in statistics is lets take a look at some more of the properties:
You can see that the optimizer was expecting 140 records but actually had 99860 records to work with this would put a hell of a lot of stress on this inefficient plan. Now you have two choices here, you can either believe me or we can clear the plan cache again and his time run the procedure again with the “Common” parameter first. What, you don’t believe me? Oh go on then…
dbcc freeproccache; go EXEC DBO.ParamSniffDemo @ParamValue = 'Common'; go EXEC DBO.ParamSniffDemo @ParamValue = 'Obscure'; go
As you can see I have run both one after the other, this makes comparisons easier and shortens the amount of time it takes me to write this blog.
First off let’s have a look at the number of logical reads, remember it was 299 for the “Obscure” compiled version with the runtime value of “Obscure” and 206220 for the “Obscure” compiled version with the runtime value of “Common”
Wow! That’s a big drop in logical reads for our “Common” parameter, I told you it was a bad plan. On the other side of the coin you can see that the “Obscure” version now uses more reads.
Now that we have observed a big shift in logical reads, let’s see what plan the optimizer has now chosen:
Interesting, the optimizer has decided that the tipping point has been reached and is now performing a clustered index scan. This version of the plan will provide us with a consistent execution time no matter what the parameter and so would be a better all round plan.
What are my options?
Scary stuff isn’t it! This may not be a problem for you right now, but one day it might and you’re going to want to know how to get around this kind of thing. Here are a couple of options.
Rewrite dynamic queries
When you write dynamic code it needs to be recompiled each time it is executed, if it needs to be recompiled then it won’t reside in the plan cache and you won’t suffer from the problem. This is not my favourite solution for a few reasons.
Firstly, you need to make sure that you don’t leave yourself open to SQL Injection.
Secondly, dynamic code is messy and can be hard to debug or alter at a later date.
Thirdly, it means that you miss out on using the plan cache which is a great optimization feature.
There are a couple of query hints that you can use such as:
Personally I kind of like this, it’s a safe middle ground option. You may be wondering why I said before recompiling is an overhead and now I’m reversing my opinion. The truth is I’m not backing out of what I said in dynamic queries, it is a better option and the reason is that now you can recompile just a single statement not the entire procedure and will reduce the CPU overhead.
Of the three options so far, this is my favourite and the reason for this is that there are just sooo many options you can choose from here. I could specify for a particular parameter to be used at compile time, in other words I am telling the optimizer which plan I want to use. The downside to this of course is that you really need to know your data which unfortunately in my experience not a lot of people do. You also need to seek help from your business to see where their priority lies. Do they have a particular value that needs to have an optimal plan. Is that the one you would have picked?
Then there’s the option to optimize for unknown which is pretty cool. Basically this is just going to let SQL Server look at the statistics and work out the average plan and go with that. This way (in theory) every plan should not be too far off it’s best plan.
All the options so far have been for people who have access to or can influence the writers of the code. There are of course a large number of companies who have third party tools and will have queries that suffer from this exact problem. For those people I present to you plan guides. Well the concept anyway, plan guides deserve a blog post of their own.
What plan guides allow you to do is to substitute a plan in the cache with one of your own making. They are a bit of work and you need to be ultra precise to make them work, but it does mean that you could add in your own join, index and query hints to make the query behave the way you believe it should have been written rather than the way it actually was.
Please check with your third party before you start doing this, they may deem it as a violation of their support agreement. If you phrase things right when you talk to them they probably won’t have a problem with it. This may save you waiting a long time for a fix you have requested.
Well that’s it for me, I’ve had a great two weeks with SQL Relay and had lots of fun presenting this session. If you attended any of them thank you very much and I hope to see you at a community event soon. For those of you who have stumbled across this via some search engine or syndication I hope you have enjoyed it.
That just leaves me to say a big thank you to all the organisers, speakers, helpers and sponsors of SQL Relay, you all did a marvellous job and it wouldn’t have happened without your hard work.
Keep checking http://www.SQLRelay.co.uk to watch out for further announcements of the next set of relay events. We have a few meetings over the next few weeks, but keep your diary free towards the end of the year for some more top quality free SQL Server events at a town near you.-