May 28, 2009 at 5:02 pm
Hi all,
I've seen several mentions of a "catch-all" query, where the "where" statement has several conditions in it. In doing this, the optimizer has to pick a plan that can handle every condition, usually meaning that the chosen plan is not optimal for any single one.
I've run across a query that does something like this...
CREATE PROC dbo.MyProc (
@SearchString varchar(2) = null,
@Value1 varchar(10) = null,
@Value2 varchar(10) = null,
@Value3 varchar(10))
AS
select *
from TableA
where ((@SearchString = 'us' and Col1 = @Value1) or
(@SearchString = 'ur' and Col1 = @Value1 and Col2 = @Value2) or
(@SearchString = 'ud' and Col1 = @Value1 and Col2 = @Value2 and Col3 = @Value3))
What I found was that even with a suitable index on these fields, the query always did an index scan.
When I separated the code out like this:
CREATE PROC dbo.MyProc (
@SearchString varchar(2) = null,
@Value1 varchar(10) = null,
@Value2 varchar(10) = null,
@Value3 varchar(10))
AS
if @SearchString = 'us'
select *
from TableA
where Col1 = @Value1
else if @SearchString = 'ur'
select *
from TableA
where Col1 = @Value1 and Col2 = @Value2
else if @SearchString = 'ud'
select *
from TableA
where Col1 = @Value1 and Col2 = @Value2 and Col3 = @Value3
With the code separated out like this, each query now does an index seek, with obviously much faster results and far fewer reads.
So, the question is:
Is it preferable to leave it as one procedure like this, or should each statement be moved into its own procedure?
ie:
if @SearchString = 'us' execute dbo.MyNewProc1 @Value1
else if @SearchString = 'ur' execute dbo.MyNewProc2 @Value1, @Value2
else if @SearchString = 'ud' execute dbo.MyNewProc3 @Value1, @Value2, @Value3
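(For completeness, each sub-procedure would just be the single query for its branch - a hypothetical sketch, reusing the table and column names from above:)

```sql
-- Hypothetical sketch of one of the sub-procedures; each one now
-- gets its own plan, compiled for exactly the predicate it runs.
CREATE PROC dbo.MyNewProc1 (@Value1 varchar(10))
AS
select *
from TableA
where Col1 = @Value1
GO
```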
I think it would be simpler to do the first way. But, which way is better for SQL?
Thanks,
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 28, 2009 at 9:51 pm
Wayne,
My guess is that the only place you will see the savings is in the compile time. Run both options with the SET STATISTICS TIME ON and see what you get. If the procedure was more complicated I would say go with the sub procedures but I'm pretty confident that you will find them to be so close in compile and execution time that the savings, if any, will be a nit.
If the complexity is greater than what is shown in your example then breaking them down into multiple procedures might result in some savings. Again, the statistics should reveal that.
Hope this helps.
David
@SQLTentmaker
"He is no fool who gives what he cannot keep to gain that which he cannot lose" - Jim Elliot
May 29, 2009 at 1:57 am
Split them out into subprocedures. The reason is parameter sniffing.
When the proc is called the first time, the entire procedure is compiled and optimised based on the values of the parameters in that call. Even the branches that won't be executed are optimised, and the parameter value may be one that the query in that branch would never actually run with.
Take this as a stupid example
CREATE PROCEDURE MultipleExecPaths (@TransactionType char(1) = NULL)
AS
IF @TransactionType IS NULL
SELECT max(transactionDate) from Production.TransactionHistory
ELSE
SELECT max(transactionDate) from Production.TransactionHistory where TransactionType = @TransactionType
Now, let's say that on the first execution the parameter is NULL. Both queries are optimised. The first one doesn't have parameters so that's easy. The second gets optimised based on a parameter value of NULL.
How many rows do you think the optimiser will compile that for? (hint: how many rows would satisfy the predicate TransactionType = NULL)
What happens when the proc does get executed and a param is passed that will affect say 10000 rows?
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 29, 2009 at 2:01 am
WayneS (5/28/2009)
I've seen several mentions of a "catch-all" query, where the "where" statement has several conditions in it.
My definition of 'catch-all' (for what it's worth) is a query that has a where clause with this form. I know this performs badly, don't know about the other forms
WHERE
(@col1 IS NULL OR Col1 = @Col1)
AND
(@col2 IS NULL OR Col2 = @Col2)
AND
(@col3 IS NULL OR Col3 = @Col3)
AND
(@col4 IS NULL OR Col4 = @Col4)
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 29, 2009 at 5:39 am
GilaMonster (5/29/2009)
WayneS (5/28/2009)
I've seen several mentions of a "catch-all" query, where the "where" statement has several conditions in it.
My definition of 'catch-all' (for what it's worth) is a query that has a where clause with this form. I know this performs badly, don't know about the other forms
WHERE
(@col1 IS NULL OR Col1 = @Col1)
AND
(@col2 IS NULL OR Col2 = @Col2)
AND
(@col3 IS NULL OR Col3 = @Col3)
AND
(@col4 IS NULL OR Col4 = @Col4)
So, by your definition (which is what I would use), my query above isn't a catch-all query. What would you call it then? "Set of queries highly prone to parameter sniffing"?
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 29, 2009 at 5:47 am
What I called it in my PASS presentation last year was "Multiple execution paths" (for the proc with multiple queries and if statements). Those are my names though. Call it whatever you like. Name doesn't affect how it performs. Your original is a variant of a 'catch-all'. Would need to play with it more to see how it performs under various circumstances.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 29, 2009 at 7:19 am
GilaMonster (5/29/2009)
WHERE
(@col1 IS NULL OR Col1 = @Col1)
AND
(@col2 IS NULL OR Col2 = @Col2)
AND
(@col3 IS NULL OR Col3 = @Col3)
AND
(@col4 IS NULL OR Col4 = @Col4)
What do you think of this pattern?
WHERE Col1 = ISNULL(@col1, Col1)
AND ...
When there are multiple optional parameters, I've sometimes run a query into a temp table to get the keys (when I know there will be only a few), then joined them together at the end. But now I'm thinking I would run into the parameter sniffing problem there too, something I hadn't considered.
Chad
May 29, 2009 at 7:49 am
Not much.
Using the same query framework that I used in my blog post on this (http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/)
CREATE PROCEDURE SearchHistory_IsNull (@Product int = NULL, @OrderID int = NULL, @TransactionType char(1) = NULL, @Qty int = NULL)
AS
SELECT ProductID, ReferenceOrderID, TransactionType, Quantity, TransactionDate, ActualCost from Production.TransactionHistory
WHERE (ProductID = ISNULL(@Product,ProductID))
AND (ReferenceOrderID = ISNULL(@OrderID,ReferenceOrderID))
AND (TransactionType = ISNULL(@TransactionType, TransactionType))
AND (Quantity = ISNULL(@Qty,Quantity))
GO
EXEC SearchHistory_IsNull @Product = 978, @TransactionType = 'W'
Clustered index scan (There is a usable index on ProductID)
Table 'TransactionHistory'. Scan count 1, logical reads 6712, physical reads 0
SQL Server Execution Times:
CPU time = 125 ms, elapsed time = 135 ms.
EXEC SearchHistory_IsNull @Qty = 100
Clustered index scan.
Table 'TransactionHistory'. Scan count 1, logical reads 6712, physical reads 0
SQL Server Execution Times:
CPU time = 344 ms, elapsed time = 334 ms.
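The alternative that does perform well (a from-memory sketch of the dynamic SQL approach discussed in that blog post, not a verbatim quote) builds only the predicates that were actually passed and runs through sp_executesql, so each parameter combination gets its own cached, seekable plan:

```sql
-- Sketch: build the WHERE clause from only the supplied parameters.
CREATE PROCEDURE SearchHistory_Dynamic (@Product int = NULL, @TransactionType char(1) = NULL)
AS
DECLARE @sql nvarchar(max) = N'SELECT ProductID, ReferenceOrderID, TransactionType,
    Quantity, TransactionDate, ActualCost
    FROM Production.TransactionHistory WHERE 1 = 1'
IF @Product IS NOT NULL
    SET @sql = @sql + N' AND ProductID = @Product'
IF @TransactionType IS NOT NULL
    SET @sql = @sql + N' AND TransactionType = @TransactionType'
-- Values are still passed as parameters, so there is no injection
-- risk from them, and plans are reused per predicate combination.
EXEC sp_executesql @sql,
    N'@Product int, @TransactionType char(1)',
    @Product, @TransactionType
GO
```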
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 29, 2009 at 7:59 am
GilaMonster (5/29/2009)
Not much....
I guess all it has going for it is "looking cool". 🙂
This is an area I need more study in - I've been planning to go out and dig into your documentation in earnest and it looks like I need to make the time to do it.
Gianluca's link looks good too, but I'm going to have to sit down and run through the examples to really understand and internalize it.
Wayne - thanks for letting me interject... we now return you to your previously scheduled programming...
Chad
May 29, 2009 at 7:34 pm
Chad Crawford (5/29/2009)
GilaMonster (5/29/2009)
Not much....
Wayne - thanks for letting me interject... we now return you to your previously scheduled programming...
Chad
Not a problem... we all get to learn more, and it helps all of us.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 30, 2009 at 2:45 am
GilaMonster (5/29/2009)
Split them out into subprocedures. The reason is parameter sniffing.
+1. Splitting into sub-procedures is the best solution.
Your other alternatives are:
1. RECOMPILE on the EXEC call, on the procedure, or on the individual statements
2. OPTIMIZE FOR
3. Copy the parameter values into local variables and reference them instead
4. USE PLAN
5. Plan guide
And probably some others. All have problems of their own. 🙂
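As a rough sketch of options 1 and 2 against Gail's example proc (the statement-level RECOMPILE placement and the OPTIMIZE FOR value here are illustrative, not prescriptive):

```sql
-- Option 1: statement-level recompile - the statement is optimised
-- for the actual parameter value on every execution.
SELECT max(transactionDate)
FROM Production.TransactionHistory
WHERE TransactionType = @TransactionType
OPTION (RECOMPILE);

-- Option 2: always optimise for a chosen representative value
-- ('W' here is just a placeholder).
SELECT max(transactionDate)
FROM Production.TransactionHistory
WHERE TransactionType = @TransactionType
OPTION (OPTIMIZE FOR (@TransactionType = 'W'));
```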
Paul
May 30, 2009 at 7:10 am
The thing that bugs me, that I don't really understand, is:
If I have a table:
CREATE TABLE dbo.TableA (
Col1 varchar(10),
Col2 varchar(10),
Col3 varchar(10))
CREATE INDEX [IX_TableA_Cover] ON dbo.TableA (Col1, Col2, Col3)
This index would handle searches on:
Col1, or
Col1 & Col2, or
Col1 & Col2 & Col3
So, back to the original proc I had, why would I get table scans for a where condition of:
where (@SearchType = 'us' and Col1 = @Value1) or
(@SearchType = 'ud' and Col1 = @Value1 and Col2 = @Value2) or
(@SearchType = 'uc' and Col1 = @Value1 and Col2 = @Value2 and Col3 = @Value3)
In any of these conditions, the best plan is to do an index seek.
Now, I understand if different indexes were involved. Or if I was skipping over columns in the index. But not when any condition just uses the next column in the same index.
I guess it's one of those things I just need to accept, but it would be nice if it made more sense to me.
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes
May 30, 2009 at 8:32 am
WayneS (5/30/2009)
The thing that bugs me, that I don't really understand, is:
In any of these conditions, the best plan is to do an index seek.
Yes, but if it does, the plan is not a reusable one. Let's say that the first call, the one that triggers the compile, has a search type of 'uc' and all three parameters passed. If SQL were to choose an index seek, the seek predicate (what it's going down the b-tree to find) would be "Col1 = @Value1 AND Col2 = @Value2 AND Col3 = @Value3". All well and good.
Let's say the next call has a search type of 'us' and only Col1 is passed. The seek predicate is now invalid for the query and the plan cannot be used.
In cases like these, the optimiser goes for a safe approach: a plan that is always valid no matter what combination of columns and constants is passed. It goes for a clustered index scan, with a predicate (not a seek predicate) applied after the scan that matches the entire where clause (including the constants).
The one rule that the optimiser has is that different parameter values may not cause a query plan to be invalid. A cached plan must be reusable no matter what parameters are passed.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
May 30, 2009 at 10:13 am
GilaMonster (5/30/2009)
WayneS (5/30/2009)
The thing that bugs me, that I don't really understand, is:
In any of these conditions, the best plan is to do an index seek.
Yes, but if it does the plan is not a reusable one.
Okay, I think I get it now. I wasn't thinking "big picture". It's the whole reusability aspect that I wasn't thinking about.
Thanks Gail!!!
Wayne
Microsoft Certified Master: SQL Server 2008
Author - SQL Server T-SQL Recipes