The Elusive Conditional WHERE Clause

  • Comments posted to this topic are about the item The Elusive Conditional WHERE Clause

  • Solid approach to the conditional WHERE clause. I might suggest WITH RECOMPILE as an option, since any saved execution plan may not be optimal depending on which WHERE conditions are active each time the sproc executes.
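    For anyone who hasn't used it, a minimal sketch of that suggestion (the proc, table and column names here are invented for illustration):

    [sql]
    -- WITH RECOMPILE discards any cached plan, so every execution is
    -- optimised for the conditions that are actually active this time.
    CREATE PROCEDURE dbo.uspFilteredDates
        @pUseCondition int
    WITH RECOMPILE
    AS
    BEGIN
        SELECT d.[Date]
        FROM dbo.Dates AS d
        WHERE (1 = @pUseCondition AND d.[Date] >= '20130101')
           OR (2 = @pUseCondition AND d.[Date] <  '20130101');
    END;
    [/sql]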

  • I love this. Thanks for sharing it!

  • Old stuff!

    Dynamic SQL will run better.

  • Thanks, it's an interesting approach.

    Just one question: Have you tested index use with this approach?

    The reason I ask is that some years back I would construct WHERE clauses as follows:

    [sql]
    WHERE
        --If a param is not null, then it will be used as a filter
        (@Param1 IS NULL OR COL1 = @Param1)
        AND (@Param2 IS NULL OR COL2 = @Param2)
        AND (@Param3 IS NULL OR COL3 = @Param3)
        --etc.
    [/sql]

    I discarded the approach because the optimiser was incapable of interpreting the pattern to choose appropriate indexes.
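    For what it's worth, on more recent builds an OPTION (RECOMPILE) hint lets the optimiser embed the actual parameter values and simplify the NULL branches away on each execution - a sketch reusing the pattern above (the table name and sample values are invented):

    [sql]
    DECLARE @Param1 int = 42, @Param2 int = NULL, @Param3 int = NULL;

    SELECT COL1, COL2, COL3
    FROM dbo.MyTable
    WHERE (@Param1 IS NULL OR COL1 = @Param1)
      AND (@Param2 IS NULL OR COL2 = @Param2)
      AND (@Param3 IS NULL OR COL3 = @Param3)
    OPTION (RECOMPILE);   -- plan rebuilt per execution, so index seeks become possible
    [/sql]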

  • Not a brand new technique, but useful indeed.

    The same approach can also be used to parametrize the ORDER BY clause...
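    A sketch of that ORDER BY variant (column and parameter names invented):

    [sql]
    -- Only the CASE branch matching @SortBy yields a sort key;
    -- the other expression evaluates to NULL for every row.
    DECLARE @SortBy char(1) = 'A';

    SELECT colA, colB
    FROM mytab
    ORDER BY
        CASE WHEN @SortBy = 'A' THEN colA END,
        CASE WHEN @SortBy = 'B' THEN colB END;
    [/sql]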

  • Super idea - haven't seen it before

  • Could you please give an example of how to use this idea to do a conditional ORDER BY?

    Thanks in advance.

  • John Hick-456673 (4/17/2013)


    Comments posted to this topic are about the item The Elusive Conditional WHERE Clause

    As long as the same column is always used, it is fine and won't need a recompile.

    However this part:

    [sql]
    or (7=@pUseCondition and d1.[WeekDay] in ('tue','thu'))
    or (8=@pUseCondition and d.[Date]>=@pEarliestDate)
    or (9=@pUseCondition and d1.[WeekDay] in ('mon','fri'))
    [/sql]

    sticks out, and on big data sets it might result in a wrong query plan being used. It's these parts that make dynamic SQL with a forced recompile more useful (though harder to evaluate and maintain), and where a recompile hint becomes useful.

    The advantage is that the code is written in one block and thus easier to maintain and test (in my opinion) - which in itself is a good thing. The statement is all there, unlike in dynamic SQL, where the final statement is only known (and maybe shown) at execution.
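    A sketch of where such a recompile hint would sit on the statement itself (the tables, the join and the sample values below are invented stand-ins for the article's proc):

    [sql]
    DECLARE @pUseCondition int = 7, @pEarliestDate date = '20130101';

    SELECT d.[Date], d1.[WeekDay]
    FROM dbo.Dates AS d
    JOIN dbo.DayNames AS d1
        ON d1.[DayNumber] = DATEPART(dw, d.[Date])
    WHERE (7 = @pUseCondition AND d1.[WeekDay] IN ('tue','thu'))
       OR (8 = @pUseCondition AND d.[Date] >= @pEarliestDate)
       OR (9 = @pUseCondition AND d1.[WeekDay] IN ('mon','fri'))
    OPTION (RECOMPILE);   -- fresh plan per execution, sized to the active branch
    [/sql]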

  • Carlo Romagnano (4/18/2013)


    Old stuff!

    Dynamic SQL will run better.

    Sorry, but all you are saying is that a red sky is always better than a green sky - not why.

    Everyone knows that a red sky is not that good (unless it is sunset or sunrise), and a green sky always needs to be taken with caution.

    Dynamic SQL has its own share of problems and is generally for more advanced developers - so if a rookie needs to maintain it, all bets are off as to whether it works afterwards.

  • Knut Boehnert (4/18/2013)


    Carlo Romagnano (4/18/2013)


    Old stuff!

    Dynamic SQL will run better.

    Sorry, but all you are saying is that a red sky is always better than a green sky - not why.

    Everyone knows that a red sky is not that good (unless it is sunset or sunrise), and a green sky always needs to be taken with caution.

    Dynamic SQL has its own share of problems and is generally for more advanced developers - so if a rookie needs to maintain it, all bets are off as to whether it works afterwards.

    Take this example:

    [sql]

    select colA, colB from mytab

    WHERE

    --If a param is not null, then it will be used as a filter

    (@Param1 IS NULL OR COL1 = @Param1)

    AND (@Param2 IS NULL OR COL2 = @Param2)

    AND (@Param3 IS NULL OR COL3 = @Param3)

    --etc.

    [/sql]

    In this case the optimizer must still evaluate COL1, COL2 and COL3 even when @Param1, @Param2 and @Param3 are NULL.

    So, you force a useless read.

    Instead,

    [sql]

    declare @sql nvarchar(max) = N'select colA, colB from mytab'

    if @Param1 IS NOT NULL
        -- append to @sql (not overwrite) and keep the value as a parameter
        SET @sql = @sql + N' WHERE COL1 = @Param1'

    -- sp_executesql requires an nvarchar statement; @Param1's type (int) is assumed
    exec sp_executesql @sql, N'@Param1 int', @Param1 = @Param1

    [/sql]

    In this case, the optimizer reads only COL1.
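    Extending that sketch to several optional filters, the usual shape (names and types still invented) appends each active condition and passes every value through sp_executesql's parameter list:

    [sql]
    DECLARE @Param1 int = 1, @Param2 int = NULL, @Param3 int = NULL;  -- sample values

    declare @sql nvarchar(max) = N'select colA, colB from mytab where 1 = 1'

    if @Param1 IS NOT NULL SET @sql += N' and COL1 = @Param1'
    if @Param2 IS NOT NULL SET @sql += N' and COL2 = @Param2'
    if @Param3 IS NOT NULL SET @sql += N' and COL3 = @Param3'

    -- parameters the final statement doesn't reference are simply ignored
    exec sp_executesql @sql,
        N'@Param1 int, @Param2 int, @Param3 int',
        @Param1 = @Param1, @Param2 = @Param2, @Param3 = @Param3
    [/sql]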

  • Just by looking at the code constructs, I recognise I do use this technique sometimes, but with severe moderation: only in cases where there are other, more straightforward filters present that will reduce the result set before the 'dynamic' conditions have to kick in, and only if those conditions themselves are trivial.

    Even if you use variables to make the input, and thus the selections, dynamic, the query plan by default will be optimized for exactly the input values you hand the optimizer the very first time you invoke the query (assuming you use parameterized queries, which you should). To treat all input equally, every time, SQL Server 2008 introduced OPTIMIZE FOR UNKNOWN as a query option:

    [sql]
    -- your query here
    OPTION ( OPTIMIZE FOR UNKNOWN );
    [/sql]

    The optimizer not being able to throw away a conditional condition (yes, I wrote that) does not make it sargable. For that, it would need to be able to make use of an index of sorts to narrow results and seek/filter quickly. A variable by its very nature is not something to be searched in (excluding table variables), so calling it sargable is kind of odd.

    See: http://en.wikipedia.org/wiki/Sargable

    You should examine, in a big test case, how many pages are read and whether this matches what you predict based on expected behaviour, to make sure you have the desired effect (see the sketch at the end of this post). If it does not, and more than the expected amount of data is read to filter, you just have a syntactically convenient way of having many conditional conditions.

    The alternative of dynamic SQL written in a stored proc I do not like much either, as that too is messy to maintain code-wise. I myself add/remove conditions in the application layer, where it is more natural and can be even more dynamic - at least in the language I work in, but with some concessions the same technique can be made to work quite well in just about any mainstream language.

    I am also not really a big fan, nor a frequent user, of stored procedures, but I understand their role in scenarios other than mine where there is no proper alternative. In such cases I would go for the dynamic SQL solution for complex cases, and for simple ones stick to conditionally selecting a query. Either solution is just a wrapper really, and thus overhead by its very nature.
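    The page-read check mentioned above, as a sketch (table, column and parameter names invented):

    [sql]
    DECLARE @Param1 int = NULL;

    SET STATISTICS IO ON;

    SELECT colA, colB
    FROM mytab
    WHERE (@Param1 IS NULL OR COL1 = @Param1);

    SET STATISTICS IO OFF;
    -- The Messages tab then reports 'logical reads' per table; a scan
    -- where you expected a seek shows up as a much larger read count.
    [/sql]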

  • As for your challenge:

    Before we “take a deep dive” into this, allow me to explain the sproc, a bit. The basic design was put to me as a challenge, not long ago, to create a list of university school weeks by WeekNumber and DayOfTheWeekNumber, i.e., Monday of the first week of school would be W1D1, Tuesday would be W1D2, etc. This was to be done using a Common Table Expression (CTE). This is probably the only way to perform this (here’s another challenge for you: can you do this WITHOUT using a CTE?).

    [sql]
    DECLARE @FirstDayOfYear DATE, @DateToCheck DATE

    SET @FirstDayOfYear = '20130101'
    SET @DateToCheck = '20130101'

    SELECT (DATEPART(DAYOFYEAR, @DateToCheck) / 7) % 52 + 1 AS W,
           (DATEPART(DAYOFYEAR, @DateToCheck) + DATEPART(dw, @FirstDayOfYear) - 2) % 7 + 1 AS D
    [/sql]

    I didn't test the performance, but a CTE is very seldom the only way to do things.

  • Carlo Romagnano (4/18/2013)


    Knut Boehnert (4/18/2013)


    Carlo Romagnano (4/18/2013)


    Old stuff!

    Dynamic SQL will run better.

    Sorry, but all you are saying is that a red sky is always better than a green sky - not why.

    Everyone knows that a red sky is not that good (unless it is sunset or sunrise), and a green sky always needs to be taken with caution.

    Dynamic SQL has its own share of problems and is generally for more advanced developers - so if a rookie needs to maintain it, all bets are off as to whether it works afterwards.

    Take this example:

    [sql]

    select colA, colB from mytab

    WHERE

    --If a param is not null, then it will be used as a filter

    (@Param1 IS NULL OR COL1 = @Param1)

    AND (@Param2 IS NULL OR COL2 = @Param2)

    AND (@Param3 IS NULL OR COL3 = @Param3)

    --etc.

    [/sql]

    In this case the optimizer must still evaluate COL1, COL2 and COL3 even when @Param1, @Param2 and @Param3 are NULL.

    So, you force a useless read.

    Instead,

    [sql]

    declare @sql nvarchar(max) = N'select colA, colB from mytab'

    if @Param1 IS NOT NULL
        -- append to @sql (not overwrite) and keep the value as a parameter
        SET @sql = @sql + N' WHERE COL1 = @Param1'

    -- sp_executesql requires an nvarchar statement; @Param1's type (int) is assumed
    exec sp_executesql @sql, N'@Param1 int', @Param1 = @Param1

    [/sql]

    In this case, the optimizer reads only COL1.

    Sure enough, in this simple example dynamic SQL is better because of the evaluation of @Param1, @Param2 and @Param3 - I fully agree with this.

    However the SQL in John's case is more elaborate.

    So [challenge on] - how would you do a better dynamic SQL for John's case, Carlo? Since you say "dynamic SQL will run better", I am very interested in how dynamic SQL can achieve this in John's scenario.

    Just a few pointers from no more than 15 minutes' thinking are fine - just something thought-provoking that might lead to a better solution.

  • Is this not just a catch-all query? In which case, check out one of Gail's blog posts on the subject --> http://sqlinthewild.co.za/index.php/2009/03/19/catch-all-queries/


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

