Resolving a "catch-all" query.

  • Hi all,

    I've seen several mentions of a "catch-all" query, where the "where" statement has several conditions in it. In doing this, the optimizer has to pick a plan that can handle every condition, usually meaning that the chosen plan is not optimum.

    I've run across a query that does something like this...

    CREATE PROC dbo.MyProc (

    @SearchString varchar(2) = null,

    @Value1 varchar(10) = null,

    @Value2 varchar(10) = null,

    @Value3 varchar(10))

    AS

    select *

    from TableA

    where ((@SearchString = 'us' and Col1 = @Value1) or

    (@SearchString = 'ur' and Col1 = @Value1 and Col2 = @Value2) or

    (@SearchString = 'ud' and Col1 = @Value1 and Col2 = @Value2 and Col3 = @Value3))

    What I found was that even with a suitable index on these fields, the query always did an index scan.

    When I separated the code out like this:

    CREATE PROC dbo.MyProc (

    @SearchString varchar(2) = null,

    @Value1 varchar(10) = null,

    @Value2 varchar(10) = null,

    @Value3 varchar(10))

    AS

    if @SearchString = 'us'

    select *

    from TableA

    where Col1 = @Value1

    else if @SearchString = 'ur'

    select *

    from TableA

    where Col1 = @Value1 and Col2 = @Value2

    else if @SearchString = 'ud'

    select *

    from TableA

    where Col1 = @Value1 and Col2 = @Value2 and Col3 = @Value3

    Like this, each query now does an index seek. With the obvious much faster results, and much less reads.

    So, the question is:

    Is it preferable to leave the one procedure like this, or should each statement be sent individual procedures?

    ie:

    if @SearchString = 'us' execute dbo.MyNewProc1 @Value1

    else if @SearchString = 'ur' execute dbo.MyNewProc2 @Value1, @Value2

    else if @SearchString = 'ud' execute dbo.MyNewProc3 @Value1, @Value2, @Value3

    I think it would be simpler to do the first way. But, which way is better for SQL?

    Thanks,

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Wayne,

    My guess is that the only place you will see the savings is in the compile time. Run both options with the SET STATISTICS TIME ON and see what you get. If the procedure was more complicated I would say go with the sub procedures but I'm pretty confident that you will find them to be so close in compile and execution time that the savings, if any, will be a nit.

    If the complexity is greater than what is shown in your example then breaking them down into multiple procedures might result in some savings. Again, the statistics should reveal that.

    Hope this helps.

    David

    @SQLTentmaker

    β€œHe is no fool who gives what he cannot keep to gain that which he cannot lose” - Jim Elliot

  • I recently had the chance to read this blog on the topic and I found really helpful.

    Hope you find it helpful too.

    Gianluca

    -- Gianluca Sartori

  • Split them out into subprocedures. The reason is parameter sniffing.

    When the proc is called the first time, the entire procedure is compiled and optimised based on the values of the procedures in that call. Even the branches that won't be executed are optimised and the value of the parameter may be one that the query would never run with.

    Take this as a stupid example

    CREATE PROCEDURE MultipleExecPaths (@TransactionType char(1) = NULL)

    AS

    IF @TransactionType IS NULL

    SELECT max(transactionDate) from Production.TransactionHistory

    ELSE

    SELECT max(transactionDate) from Production.TransactionHistory where TransactionType = @TransactionType

    Now, let's say that on the first execution the parameter is NULL. Both queries are optimised. The first one doesn't have parameters so that's easy. The second gets optimised based on a parameter value of NULL.

    Now many rows do you think the optimiser will compile that for? (hint: how many rows would satisfy the predicate TransactionType = NULL)

    What happens when the proc does get executed and a param is passed that will affect say 10000 rows?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • WayneS (5/28/2009)


    I've seen several mentions of a "catch-all" query, where the "where" statement has several conditions in it.

    My definition of 'catch-all' (for what it's worth) is a query that has a where clause with this form. I know this performs badly, don't know about the other forms

    WHERE

    (@col1 IS NULL OR Col1 = @Col1)

    AND

    (@col2 IS NULL OR Col2 = @Col2)

    AND

    (@col3 IS NULL OR Col3 = @Col3)

    AND

    (@col4 IS NULL OR Col4 = @Col4)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/29/2009)


    WayneS (5/28/2009)


    I've seen several mentions of a "catch-all" query, where the "where" statement has several conditions in it.

    My definition of 'catch-all' (for what it's worth) is a query that has a where clause with this form. I know this performs badly, don't know about the other forms

    WHERE

    (@col1 IS NULL OR Col1 = @Col1)

    AND

    (@col2 IS NULL OR Col2 = @Col2)

    AND

    (@col3 IS NULL OR Col3 = @Col3)

    AND

    (@col4 IS NULL OR Col4 = @Col4)

    So, by your definition (which is what I would use), my query above isn't a catch-all query. What would you call it then? "Set of queries highly prone to parameter sniffing"?

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • What I called it in my PASS presentation last year was "Multiple execution paths" (for the proc with multiple queries and if statements). Those are my names though. Call it whatever you like. Name doesn't affect how it performs. Your original is a variant of a 'catch-all'. Would need to play with it more to see how it performs under various circumstances.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/29/2009)


    WHERE

    (@col1 IS NULL OR Col1 = @Col1)

    AND

    (@col2 IS NULL OR Col2 = @Col2)

    AND

    (@col3 IS NULL OR Col3 = @Col3)

    AND

    (@col4 IS NULL OR Col4 = @Col4)

    What do you think of this pattern?WHERE Col1 = ISNULL(@col1, Col1)

    AND ....When there are multiple optional parameters, I've sometimes run a query into a temp table to get the keys (when I know there will be only a few), then joined them together at the end, but now I'm thinking I would run into the parameter sniffing problem there, something I hadn't considered.

    Chad

  • Not much.

    Using the same query framework that I used in my blog post on this (http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/)

    CREATE PROCEDURE SearchHistory_IsNull (@Product int = NULL, @OrderID int = NULL, @TransactionType char(1) = NULL, @Qty int = NULL)

    AS

    SELECT ProductID, ReferenceOrderID, TransactionType, Quantity, TransactionDate, ActualCost from Production.TransactionHistory

    WHERE (ProductID = ISNULL(@Product,ProductID))

    AND (ReferenceOrderID = ISNULL(@OrderID,ReferenceOrderID))

    AND (TransactionType = ISNULL(@TransactionType, TransactionType))

    AND (Quantity = ISNULL(@Qty,Quantity))

    GO

    EXEC SearchHistory_IsNull @Product = 978, @TransactionType = 'W'

    Clustered index scan (There is a usable index on ProductID)

    Table 'TransactionHistory'. Scan count 1, logical reads 6712, physical reads 0

    SQL Server Execution Times:

    CPU time = 125 ms, elapsed time = 135 ms.

    EXEC SearchHistory_IsNull @Qty = 100

    Clustered index scan.

    Table 'TransactionHistory'. Scan count 1, logical reads 6712, physical reads 0

    SQL Server Execution Times:

    CPU time = 344 ms, elapsed time = 334 ms.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/29/2009)


    Not much.

    ...

    I guess all it has going for it is "looking cool". πŸ™‚

    This is an area I need more study in - I've been planning to go out and dig into your documentation in earnest and it looks like I need to make the time to do it.

    Gianluca's link looks good too, but I'm going to have to sit down and run through the examples to really understand and internalize it.

    Wayne - thanks for letting me interject... we now return you to your previously scheduled programming...

    Chad

  • Chad Crawford (5/29/2009)


    GilaMonster (5/29/2009)


    Not much.

    ...

    Wayne - thanks for letting me interject... we now return you to your previously scheduled programming...

    Chad

    Not a problem... we all get to more learn, and it helps out all of us.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • GilaMonster (5/29/2009)


    Split them out into subprocedures. The reason is parameter sniffing.

    +1. Splitting into sub-procedures is the best solution.

    Your other alternatives are:

    1. RECOMPILE on the EXEC call, on the procedure, or on the individual statements

    2. OPTIMIZE FOR

    3. Copy the parameter values into local variables and reference them instead

    4. USE PLAN

    5. Plan guide

    And probably some others. All have problems of their own. πŸ™‚

    Paul

  • The thing that bugs me, that I don't really understand, is:

    If I have a table:

    CREATE TABLE dbo.TableA (

    Col1 varchar(10),

    Col2 varchar(10),

    Col3 varchar(10))

    CREATE INDEX [IX_TableA_Cover] ON dbo.TableA (Col1, Col2, Col3)

    This index would handle searches on:

    Col1, or

    Col1 & Col2, or

    Col1 & Col2 & Col3

    So, back to the original proc I had, why would I get table scans for a where condition of:

    where (@SearchType = 'us' and Col1 = @Value1) or

    (@SearchType = 'ud' and Col1 = @Value1 and Col2 = @Value2) or

    (@SearchType = 'uc' and Col1 = @Value1 and Col2 = @Value2 and Col3 = @Value3)

    In any of these conditions, the best plan is to do an indexed seek.

    Now, I understand if different indexes were involved. Or if I was skipping over columns in the index. But not when any condition just uses the next column in the same index.

    I guess it's just one of those things I just need to accept, but it would be nice to have it make more sense to me.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (5/30/2009)


    The thing that bugs me, that I don't really understand, is:

    In any of these conditions, the best plan is to do an indexed seek.

    Yes, but if it does the plan is not a reusable one. Let's say that the first call, the one that triggers the compile has a search type of 'uc' and all three parameters passed. If SQL were to chose an index seek, the seek predicate (what it's going down the b-tree to find) would be "Col1 = @Value1 AND Col2 = @Value2 AND Col3 = @Value3". All well and good.

    Let's say the next call has a search type of 'us' and only Col1 is passed. The seek predicate is now invalid for the query and the plan cannot be used.

    In cases like these, the optimiser goes for a safe approach. A plan that is always valid no matter what combination of columns and constants is passed. It goes for a clustered index scan and a predicate (not a seek predicate) applied after the scan that matches the entire where clause (including constants)

    The one rule that the optimiser has is that different parameter values may not cause a query plan to be invalid. A cached plan must be reusable no matter what parameters are passed.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/30/2009)


    WayneS (5/30/2009)


    The thing that bugs me, that I don't really understand, is:

    In any of these conditions, the best plan is to do an indexed seek.

    Yes, but if it does the plan is not a reusable one.

    Okay, I think I get it now. I wasn't thinking "big picture". It's the whole reusability aspect that I wasn't thinking about.

    Thanks Gail!!!

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply