May 28, 2009 at 5:02 pm
Hi all,
I've seen several mentions of a "catch-all" query, where the "where" statement has several conditions in it. In doing this, the optimizer has to pick a plan that can handle every condition, usually meaning that the chosen plan is not optimal for any single one.
I've run across a query that does something like this...
CREATE PROC dbo.MyProc (
@SearchString varchar(2) = null,
@Value1 varchar(10) = null,
@Value2 varchar(10) = null,
@Value3 varchar(10))
AS
select *
from TableA
where ((@SearchString = 'us' and Col1 = @Value1) or
(@SearchString = 'ur' and Col1 = @Value1 and Col2 = @Value2) or
(@SearchString = 'ud' and Col1 = @Value1 and Col2 = @Value2 and Col3 = @Value3))
What I found was that even with a suitable index on these fields, the query always did an index scan.
When I separated the code out like this:
CREATE PROC dbo.MyProc (
@SearchString varchar(2) = null,
@Value1 varchar(10) = null,
@Value2 varchar(10) = null,
@Value3 varchar(10))
AS
if @SearchString = 'us'
select *
from TableA
where Col1 = @Value1
else if @SearchString = 'ur'
select *
from TableA
where Col1 = @Value1 and Col2 = @Value2
else if @SearchString = 'ud'
select *
from TableA
where Col1 = @Value1 and Col2 = @Value2 and Col3 = @Value3
With the code separated out like this, each query now does an index seek, with obviously much faster results and far fewer reads.
So, the question is:
Is it preferable to leave it as one procedure like this, or should each statement be moved into its own procedure?
ie:
if @SearchString = 'us' execute dbo.MyNewProc1 @Value1
else if @SearchString = 'ur' execute dbo.MyNewProc2 @Value1, @Value2
else if @SearchString = 'ud' execute dbo.MyNewProc3 @Value1, @Value2, @Value3
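(For completeness, each sub-procedure would just be the single query for its branch - a hypothetical sketch, reusing the table and column names from above:)

```sql
-- Hypothetical sketch of one of the sub-procedures; each one now
-- gets its own plan, compiled for exactly the predicate it runs.
CREATE PROC dbo.MyNewProc1 (@Value1 varchar(10))
AS
select *
from TableA
where Col1 = @Value1
GO
```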
I think it would be simpler to do the first way. But, which way is better for SQL?
Thanks,
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 28, 2009 at 9:51 pm
Wayne,
My guess is that the only place you will see the savings is in the compile time. Run both options with the SET STATISTICS TIME ON and see what you get. If the procedure was more complicated I would say go with the sub procedures but I'm pretty confident that you will find them to be so close in compile and execution time that the savings, if any, will be a nit.
If the complexity is greater than what is shown in your example then breaking them down into multiple procedures might result in some savings. Again, the statistics should reveal that.
Hope this helps.
David
@SQLTentmaker
"He is no fool who gives what he cannot keep to gain that which he cannot lose" - Jim Elliot
May 29, 2009 at 1:57 am
Split them out into subprocedures. The reason is parameter sniffing.
When the proc is called the first time, the entire procedure is compiled and optimised based on the values of the parameters in that call. Even the branches that won't be executed are optimised, and the parameter value may be one that the query in that branch would never actually run with.
Take this as a stupid example
CREATE PROCEDURE MultipleExecPaths (@TransactionType char(1) = NULL)
AS
IF @TransactionType IS NULL
SELECT max(transactionDate) from Production.TransactionHistory
ELSE
SELECT max(transactionDate) from Production.TransactionHistory where TransactionType = @TransactionType
Now, let's say that on the first execution the parameter is NULL. Both queries are optimised. The first one doesn't have parameters so that's easy. The second gets optimised based on a parameter value of NULL.
How many rows do you think the optimiser will compile that for? (hint: how many rows would satisfy the predicate TransactionType = NULL)
What happens when the proc does get executed and a param is passed that will affect say 10000 rows?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 29, 2009 at 2:01 am
WayneS (5/28/2009)
I've seen several mentions of a "catch-all" query, where the "where" statement has several conditions in it.
My definition of 'catch-all' (for what it's worth) is a query that has a where clause with this form. I know this performs badly, don't know about the other forms
WHERE
(@col1 IS NULL OR Col1 = @Col1)
AND
(@col2 IS NULL OR Col2 = @Col2)
AND
(@col3 IS NULL OR Col3 = @Col3)
AND
(@col4 IS NULL OR Col4 = @Col4)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 29, 2009 at 5:39 am
GilaMonster (5/29/2009)
WayneS (5/28/2009)
I've seen several mentions of a "catch-all" query, where the "where" statement has several conditions in it.
My definition of 'catch-all' (for what it's worth) is a query that has a where clause with this form. I know this performs badly, don't know about the other forms
WHERE
(@col1 IS NULL OR Col1 = @Col1)
AND
(@col2 IS NULL OR Col2 = @Col2)
AND
(@col3 IS NULL OR Col3 = @Col3)
AND
(@col4 IS NULL OR Col4 = @Col4)
So, by your definition (which is what I would use), my query above isn't a catch-all query. What would you call it then? "Set of queries highly prone to parameter sniffing"?
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 29, 2009 at 5:47 am
What I called it in my PASS presentation last year was "Multiple execution paths" (for the proc with multiple queries and if statements). Those are my names though. Call it whatever you like. Name doesn't affect how it performs. Your original is a variant of a 'catch-all'. Would need to play with it more to see how it performs under various circumstances.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 29, 2009 at 7:19 am
GilaMonster (5/29/2009)
WHERE
(@col1 IS NULL OR Col1 = @Col1)
AND
(@col2 IS NULL OR Col2 = @Col2)
AND
(@col3 IS NULL OR Col3 = @Col3)
AND
(@col4 IS NULL OR Col4 = @Col4)
What do you think of this pattern?
WHERE Col1 = ISNULL(@col1, Col1)
AND ...
When there are multiple optional parameters, I've sometimes run a query into a temp table to get the keys (when I know there will be only a few), then joined them together at the end. But now I'm thinking I would run into the parameter sniffing problem there too, something I hadn't considered.
Chad
May 29, 2009 at 7:49 am
Not much.
Using the same query framework that I used in my blog post on this (http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/)
CREATE PROCEDURE SearchHistory_IsNull (@Product int = NULL, @OrderID int = NULL, @TransactionType char(1) = NULL, @Qty int = NULL)
AS
SELECT ProductID, ReferenceOrderID, TransactionType, Quantity, TransactionDate, ActualCost from Production.TransactionHistory
WHERE (ProductID = ISNULL(@Product,ProductID))
AND (ReferenceOrderID = ISNULL(@OrderID,ReferenceOrderID))
AND (TransactionType = ISNULL(@TransactionType, TransactionType))
AND (Quantity = ISNULL(@Qty,Quantity))
GO
EXEC SearchHistory_IsNull @Product = 978, @TransactionType = 'W'
Clustered index scan (There is a usable index on ProductID)
Table 'TransactionHistory'. Scan count 1, logical reads 6712, physical reads 0
SQL Server Execution Times:
CPU time = 125 ms, elapsed time = 135 ms.
EXEC SearchHistory_IsNull @Qty = 100
Clustered index scan.
Table 'TransactionHistory'. Scan count 1, logical reads 6712, physical reads 0
SQL Server Execution Times:
CPU time = 344 ms, elapsed time = 334 ms.
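The alternative that does perform well (a from-memory sketch of the dynamic SQL approach discussed in that blog post, not a verbatim quote) builds only the predicates that were actually passed and runs through sp_executesql, so each parameter combination gets its own cached, seekable plan:

```sql
-- Sketch: build the WHERE clause from only the supplied parameters.
CREATE PROCEDURE SearchHistory_Dynamic (@Product int = NULL, @TransactionType char(1) = NULL)
AS
DECLARE @sql nvarchar(max) = N'SELECT ProductID, ReferenceOrderID, TransactionType,
    Quantity, TransactionDate, ActualCost
    FROM Production.TransactionHistory WHERE 1 = 1'
IF @Product IS NOT NULL
    SET @sql = @sql + N' AND ProductID = @Product'
IF @TransactionType IS NOT NULL
    SET @sql = @sql + N' AND TransactionType = @TransactionType'
-- Values are still passed as parameters, so there is no injection
-- risk from them, and plans are reused per predicate combination.
EXEC sp_executesql @sql,
    N'@Product int, @TransactionType char(1)',
    @Product, @TransactionType
GO
```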
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 29, 2009 at 7:59 am
GilaMonster (5/29/2009)
Not much....
I guess all it has going for it is "looking cool". 🙂
This is an area I need more study in - I've been planning to go out and dig into your documentation in earnest and it looks like I need to make the time to do it.
Gianluca's link looks good too, but I'm going to have to sit down and run through the examples to really understand and internalize it.
Wayne - thanks for letting me interject... we now return you to your previously scheduled programming...
Chad
May 29, 2009 at 7:34 pm
Chad Crawford (5/29/2009)
GilaMonster (5/29/2009)
Not much....
Wayne - thanks for letting me interject... we now return you to your previously scheduled programming...
Chad
Not a problem... we all get to learn more, and it helps all of us.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 30, 2009 at 2:45 am
GilaMonster (5/29/2009)
Split them out into subprocedures. The reason is parameter sniffing.
+1. Splitting into sub-procedures is the best solution.
Your other alternatives are:
1. RECOMPILE on the EXEC call, on the procedure, or on the individual statements
2. OPTIMIZE FOR
3. Copy the parameter values into local variables and reference them instead
4. USE PLAN
5. Plan guide
And probably some others. All have problems of their own. 🙂
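As a rough sketch of options 1 and 2 against Gail's example proc (the statement-level RECOMPILE placement and the OPTIMIZE FOR value here are illustrative, not prescriptive):

```sql
-- Option 1: statement-level recompile - the statement is optimised
-- for the actual parameter value on every execution.
SELECT max(transactionDate)
FROM Production.TransactionHistory
WHERE TransactionType = @TransactionType
OPTION (RECOMPILE);

-- Option 2: always optimise for a chosen representative value
-- ('W' here is just a placeholder).
SELECT max(transactionDate)
FROM Production.TransactionHistory
WHERE TransactionType = @TransactionType
OPTION (OPTIMIZE FOR (@TransactionType = 'W'));
```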
Paul
May 30, 2009 at 7:10 am
The thing that bugs me, that I don't really understand, is:
If I have a table:
CREATE TABLE dbo.TableA (
Col1 varchar(10),
Col2 varchar(10),
Col3 varchar(10))
CREATE INDEX [IX_TableA_Cover] ON dbo.TableA (Col1, Col2, Col3)
This index would handle searches on:
Col1, or
Col1 & Col2, or
Col1 & Col2 & Col3
So, back to the original proc I had, why would I get table scans for a where condition of:
where (@SearchType = 'us' and Col1 = @Value1) or
(@SearchType = 'ud' and Col1 = @Value1 and Col2 = @Value2) or
(@SearchType = 'uc' and Col1 = @Value1 and Col2 = @Value2 and Col3 = @Value3)
In any of these conditions, the best plan is to do an index seek.
Now, I understand if different indexes were involved. Or if I was skipping over columns in the index. But not when any condition just uses the next column in the same index.
I guess it's one of those things I just need to accept, but it would be nice if it made more sense to me.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 30, 2009 at 8:32 am
WayneS (5/30/2009)
The thing that bugs me, that I don't really understand, is:
In any of these conditions, the best plan is to do an index seek.
Yes, but if it does, the plan is not a reusable one. Let's say that the first call, the one that triggers the compile, has a search type of 'uc' and all three parameters passed. If SQL were to choose an index seek, the seek predicate (what it's going down the b-tree to find) would be "Col1 = @Value1 AND Col2 = @Value2 AND Col3 = @Value3". All well and good.
Let's say the next call has a search type of 'us' and only Col1 is passed. The seek predicate is now invalid for the query and the plan cannot be used.
In cases like these, the optimiser goes for a safe approach: a plan that is always valid no matter what combination of columns and constants is passed. It goes for a clustered index scan, with a predicate (not a seek predicate) applied after the scan that matches the entire where clause (including the constants).
The one rule that the optimiser has is that different parameter values may not cause a query plan to be invalid. A cached plan must be reusable no matter what parameters are passed.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 30, 2009 at 10:13 am
GilaMonster (5/30/2009)
WayneS (5/30/2009)
The thing that bugs me, that I don't really understand, is:
In any of these conditions, the best plan is to do an index seek.
Yes, but if it does the plan is not a reusable one.
Okay, I think I get it now. I wasn't thinking "big picture". It's the whole reusability aspect that I wasn't thinking about.
Thanks Gail!!!
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes