The Elusive Conditional WHERE Clause

  • The posts from peter-757102 and Sean Lange say it all. In a nutshell, this approach will almost always cause index/table scans. Not an issue for a small data set like the one in this article, but a potential killer when large tables are involved.

  • SalvageDog (4/18/2013)


    The posts from peter-757102 and Sean Lange say it all. In a nutshell, this approach will almost always cause index/table scans. Not an issue for a small data set like the one in this article, but a potential killer when large tables are involved.

    And it isn't just an IO performance killer with those SCAN plans.

    1) more useful pages get forced out of the buffer pool, causing slowness across the board

    2) if there are JOINs involved, you will get potentially very large CPU (and tempdb) killing HASH joins

    3) all those scans take shared locks, thus killing concurrency by preventing all INSERT/UPDATE/DELETE activity on the objects involved for the duration of the SELECT query

    I have seen this type of query pattern bring even moderately-sized systems to an UNUSABLE STATE under relatively low usage. It is a HORRIBLY BAD pattern that should be avoided on anything but the smallest and lowest-used systems.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Alternate for generating the week/day pairs (probably need to adjust dates to start of school year, rather than beginning of calendar year):

    WITH dates(DT) AS

    (

    SELECT CONVERT(date,DATEADD(day,-7,GETDATE())) as DT

    UNION ALL

    SELECT DATEADD(d,1,DT)

    FROM dates

    WHERE DT < CONVERT(date,DATEADD(year,1,GETDATE()))

    )

    SELECT DT,'W'+CONVERT(nvarchar,DATEPART(wk,DT))+'D'+CONVERT(nvarchar,DATEPART(dw,DT))

    FROM dates

    OPTION (MAXRECURSION 0)

  • luke.baughan (4/18/2013)


    Great article - though I did find myself wading through the rather large example. Have written this much smaller example...

    Best post! Thank you. The article didn't really make sense to me, but your example was very helpful. I will definitely try this!

    What is the significance of wording the where clause as "1=@variable" instead of "@variable=1"?


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • John Hick-456673 (4/18/2013)


    I have tested it against large tables (1 billion plus rows) and performance is much better than with dynamic SQL.

    I'm going to be blunt here.

    I find that extremely hard to believe.

    The only overhead dynamic SQL would have is compile time of the query, which would be negligible relative to a long running query.

    The whole point of the WHERE clause looking like:

    WHERE (0=@pUseCondition

    or (1=@pUseCondition and d1.[WeekDay]='mon')

    or (2=@pUseCondition and d1.[WeekDay]='tue')

    or (3=@pUseCondition and d1.[WeekDay]='wed')

    is that if for example @pUseCondition==1, then the above query will run as if:

    WHERE (0=@pUseCondition

    or (1=@pUseCondition and d1.[WeekDay]='mon')

    or (2=@pUseCondition and d1.[WeekDay]='tue')

    or (3=@pUseCondition and d1.[WeekDay]='wed')

    So the question is, why on earth would that "perform much better than dynamic SQL" looking like?:

    WHERE d1.[WeekDay]='mon'

    Are you certain there weren't any flaws in your test approach? Such as:

    - Poor indexes

    - Failure to clear cache between tests.

    - Misjudging perceived improvement vs. actual improvement.

    - Errors in your dynamic SQL code.

    One strong possibility is that your query starts to return data sooner because it is forced to table-scan.

    Whereas dynamic SQL might produce an overall more efficient plan that uses intermediate result sets and delays only the start of returning data.

    So when you claim "performance is much better than with dynamic SQL", did you:

    - Check IO operations?

    - Check total execution time?

    - Ensure properly comparable test conditions?

    - Eliminate 'noise factors' like the overhead of transmitting large volumes of data over the network?

  • I concur Craig. Rigorous testing will ABSOLUTELY prove that the OR stuff is MUCH worse performing than a dynamic sql solution over the full variety of input values.

    I do note that on latest version of SQL Server an OPTION (RECOMPILE) stands a reasonable chance of getting the correct plan (i.e. index seeks on the correct index where appropriate), but to my knowledge it still isn't bullet proof, which dynamic SQL is. You simply cannot get a better chance at an optimal plan every time than hard-coded values in a minimalist WHERE clause!!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Sean: Thanks so much for the link to Gail's page. I found it extremely helpful.

  • JJ B (4/18/2013)


    Sean: Thanks so much for the link to Gail's page. I found it extremely helpful.

    You are welcome. You should check some of her other posts too. She has some great posts on there.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • For a parameterized order by clause:

    CREATE PROCEDURE sp_ParameterizedOrderBy @orderby VARCHAR(25)

    AS

    BEGIN

    SELECT *

    FROM information_schema.columns

    ORDER BY CASE @orderby

    WHEN 'column'

    THEN column_name

    WHEN 'table'

    THEN table_name

    ELSE NULL

    END

    END

    GO

    EXEC sp_ParameterizedOrderBy 'column'

  • For an example of parameterized orderby clause:

    CREATE PROCEDURE sp_ParameterizedOrderBy @orderby VARCHAR(25)

    AS

    BEGIN

    SELECT *

    FROM information_schema.columns

    ORDER BY CASE @orderby

    WHEN 'column'

    THEN column_name

    WHEN 'table'

    THEN table_name

    ELSE NULL

    END

    END

    GO

    EXEC sp_ParameterizedOrderBy 'column'

  • Regarding the Dynamic-SQL vs Conditional-WHERE discussion there is 1 thing nobody seems to care about : security. Not in the sense of sql-injection, but in the sense of the required permissions on the objects (tables) being used.

    => When you EXEC (@sql) inside a stored procedure, this pretty much acts the same as if the user who called the stored procedure had executed the code in @sql himself. This means he needs to have relevant rights on the underlying tables which is something we might rather not want and one of the (main?) reasons why we have stored procedures. (**)

    Also, you can't use dynamic sql in functions... (not that I use them a lot, but hey, maybe some-one does =)

    In all honesty, I'm convinced too that (decent) dynamic-sql will out-perform the construction at hand, but that doesn't mean I'm against it, on the contrary! Unless performance is a real issue, I try to stay clear of dynamic-sql because writing dynamic code in SQL always looks ugly, there are the security issues to consider and in the end you'll by typing (and maintaining) a lot more code to achieve the same thing.

    My 2 cents.

    (**: yes, I know you can use EXEC AS for quite a while now, but not all of us are lucky enough to have their clients keeping up with versions : if it works, it works; why pay a gazillion to get a newer version?)

  • deroby (4/19/2013)


    Regarding the Dynamic-SQL vs Conditional-WHERE discussion there is 1 thing nobody seems to care about : security. Not in the sense of sql-injection, but in the sense of the required permissions on the objects (tables) being used.

    => When you EXEC (@sql) inside a stored procedure, this pretty much acts the same as if the user who called the stored procedure had executed the code in @sql himself. This means he needs to have relevant rights on the underlying tables which is something we might rather not want and one of the (main?) reasons why we have stored procedures. (**)

    Also, you can't use dynamic sql in functions... (not that I use them a lot, but hey, maybe some-one does =)

    In all honesty, I'm convinced too that (decent) dynamic-sql will out-perform the construction at hand, but that doesn't mean I'm against it, on the contrary! Unless performance is a real issue, I try to stay clear of dynamic-sql because writing dynamic code in SQL always looks ugly, there are the security issues to consider and in the end you'll by typing (and maintaining) a lot more code to achieve the same thing.

    My 2 cents.

    (**: yes, I know you can use EXEC AS for quite a while now, but not all of us are lucky enough to have their clients keeping up with versions : if it works, it works; why pay a gazillion to get a newer version?)

    Yes security can be an issue for the dynamic sql approach. However I don't agree with your assessment of "if it ain't broke don't upgrade it". EXEC AS has been available since 2005. If you have clients still on 2000 then there are so many other issues other than losing out on EXEC AS. If you are stuck with 2k then use the approach outlined in this article. The performance won't be a big deal because the users are expecting everything to be slow anyway. 😛

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • buyme92 (4/19/2013)


    input this URL:

    spam reported

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • I agree! That technique is very powerful but it can also very quickly expose you to the risk of having SQL Server cache wrong execution plan (because of "parameters sniffing"). So if you plan to use it on large volume tables, I would advise that you carefully review the distribution of the resultsets you may get in return when varying input parameters. If they change widely or would benefit from different indexes, you may experience random performance issues.

  • There is also the option of using a table type parameter as a container for variable and/variable length parameter lists.

    Build a type containing name - value pairs, populate it in code, and you can simply join directly to it something like this.

    SELECT

    a.Field1,

    a.Field2

    FROM TableA a

    JOIN TableParameter p0 ON p0.ParamName = 'Field3'

    AND a.Field3 = ISNULL(p0.ParamValue, a.Field3)

    JOIN TableParameter p1 ON p1.ParamName = 'Field4'

    AND a.Field4 = ISNULL(p1.ParamValue, a.Field4);

    Unless we're also talking about a large paginated data set being returned LINQ has made it much more convenient in many cases to handle sorting once the data is returned from SQL rather than jumping through hoops in the query to accommodate that.

Viewing 15 posts - 31 through 45 (of 79 total)

You must be logged in to reply to this topic. Login to reply