Conditional WHERE Clauses and Boolean Algebra

  • This concept was discussed extensively in Erland Sommarskog's well-known article "Dynamic Search Conditions" (http://www.sommarskog.se/dyn-search-2005.html). You should at least link to it in your main article. 🙂

    One of the previous comments was correct about index use; if I wanted to choose between using a LIKE condition and not using one, I would wrap the whole thing in an IF:

    IF <need to use LIKE>
        SELECT <Columns> FROM <Table> WHERE <Column> LIKE '%' + @String + '%';
    ELSE
        SELECT <Columns> FROM <Table>;

    See Erland Sommarskog's discussion in the referenced link about "x = @x AND @x IS NOT NULL", which notes that this syntax can cause SQL Server to "add[] a filter with a startup expression to the plan", helping performance.
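    As I read the referenced article, the pattern it describes looks roughly like this (my paraphrase; the table and column names are taken from the examples later in this thread, not from the article):

    ```sql
    -- Spelling out "@fname IS NOT NULL" alongside the comparison can let the
    -- optimizer attach a startup-expression filter to the plan, so the
    -- comparison branch is skipped at run time whenever the parameter
    -- was not supplied.
    SELECT FIRST_NAME, LAST_NAME
    FROM dbo.CUSTOMER
    WHERE (FIRST_NAME = @fname AND @fname IS NOT NULL)
       OR @fname IS NULL;
    ```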

  • I was interested to know what this actually does in SQL 2008...

    So I created this...

    drop procedure dbo.USR_TEST_SEARCH
    GO

    create procedure dbo.USR_TEST_SEARCH
        @fname_search varchar(40) = null,
        @lname_search varchar(40) = null
    as
    select FIRST_NAME, LAST_NAME, MASTER_CUSTOMER_ID
    from dbo.CUSTOMER
    where (@fname_search is null OR FIRST_NAME like '%' + @fname_search + '%')
      AND (@lname_search is null OR LAST_NAME like '%' + @lname_search + '%')
    GO

    I have a table here with 271,000 customers, so I did some tests on that. Note that this is against real data, in an active database, in the middle of the day.

    Run times:

    no params: 2283 ms (an index scan was done, because I have an index covering the three result columns)

    fname only = "jim": 173 ms (2100 rows found); again, an index scan is used

    both names: 170 ms (5 rows found); also an index scan

    So that was pretty good... but then I devised a way to show what I said above... I added a third search field...

    create procedure dbo.USR_TEST_SEARCH
        @fname_search varchar(40) = null,
        @lname_search varchar(40) = null,
        @class_search varchar(24) = null
    as
    select FIRST_NAME, LAST_NAME, MASTER_CUSTOMER_ID
    from dbo.CUSTOMER
    where (@fname_search is null OR FIRST_NAME like '%' + @fname_search + '%')
      AND (@lname_search is null OR LAST_NAME like '%' + @lname_search + '%')
      AND (@class_search is null OR CUSTOMER_CLASS_CODE = @class_search)
    GO

    Now the performance is horrible, even when you search on first name only. The 170 ms above went to 313 ms, almost double the time, and the query plan became complicated, with nested loops, one index seek, and one index scan. NOT IDEAL when you aren't searching with the customer class parameter.

  • roger.plowman (8/31/2010)


    I'd think (from a performance standpoint) using LIKE is the *last* thing you'd ever want to do. Especially the "Like %X%" construction, since at best it's going to do an index scan.

    "LIKE X%" performs adequately since it lets the query optimizer at least limit the scan to a subset of the index, but both "LIKE %X%" and "LIKE %X" are horrible performance killers.

    Is there any way around this dilemma? I only add the LIKE syntax dynamically if the user passed data for that parameter, but if this is bad regardless, I'd like to do this correctly if possible. I use LIKE '%X%' because the user requested the ability to search on any portion of the content. I hope this is not too much of a tangent question!

  • LSAdvantage (8/31/2010)


    roger.plowman (8/31/2010)


    I'd think (from a performance standpoint) using LIKE is the *last* thing you'd ever want to do. Especially the "Like %X%" construction, since at best it's going to do an index scan.

    "LIKE X%" performs adequately since it lets the query optimizer at least limit the scan to a subset of the index, but both "LIKE %X%" and "LIKE %X" are horrible performance killers.

    Is there any way around this dilemma? I only add the LIKE syntax dynamically if the user passed data for that parameter, but if this is bad regardless, I'd like to do this correctly if possible. I use LIKE '%X%' because the user requested the ability to search on any portion of the content. I hope this is not too much of a tangent question!

    No, there really isn't... since you're searching for a substring, SQL Server has to look at each column value explicitly and search it for the substring; there is no way for it to know whether the substring is contained in the column without actually reading the column. If you leave off the leading %, then it can use an index, but sometimes it still doesn't. I find that most of the time when users say they want to search for substrings, they really don't want to search the middle of words. So I would be looking for clarification of the requirements, and making it clear that there is a performance issue, if they want to search the middle of words.
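    If the requirement can be narrowed to starts-with matching, the predicate becomes sargable. A minimal sketch (table and column names follow the CUSTOMER example earlier in the thread; this is my illustration, not from the post):

    ```sql
    -- A fixed prefix lets the optimizer seek an index on LAST_NAME,
    -- instead of the full scan that a leading '%' forces.
    SELECT FIRST_NAME, LAST_NAME, MASTER_CUSTOMER_ID
    FROM dbo.CUSTOMER
    WHERE LAST_NAME LIKE @lname_search + '%';
    ```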

  • Very good point: "So I would be looking for clarification of the requirements, and making it clear that there is a performance issue, if they want to search the middle of words."

    Thank you Jasmine.

    Lisa

  • LSAdvantage (8/31/2010)


    Very good point: "So I would be looking for clarification of the requirements, and making it clear that there is a performance issue, if they want to search the middle of words."

    Thank you Jasmine.

    Lisa

    Exactly.

    On the bright side, with a dynamic query, if they don't search on that option, the query won't include it.

    I've actually got dynamic search queries that change the joins if specific text values aren't part of the search.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.

  • Why not use sp_ExecuteSQL and build up the query that is needed in T-SQL code? This will cache the execution plan for the query statement.

    The other thing I would suggest is that if you are going to use multiple LIKE '%[some string]%' conditions, then you consider using full-text indexes.
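    A rough sketch of the sp_ExecuteSQL approach (the procedure name and the table/column names follow the earlier example; the exact code is my own illustration, not from the thread):

    ```sql
    -- Build only the predicates the caller actually supplied, then run the
    -- statement through sp_executesql so each distinct statement shape
    -- gets its own cached plan.
    create procedure dbo.USR_TEST_SEARCH_DYN
        @fname_search varchar(40) = null,
        @lname_search varchar(40) = null
    as
    declare @sql nvarchar(max) =
        N'select FIRST_NAME, LAST_NAME, MASTER_CUSTOMER_ID
          from dbo.CUSTOMER where 1 = 1';

    if @fname_search is not null
        set @sql += N' and FIRST_NAME like ''%'' + @fname + ''%''';
    if @lname_search is not null
        set @sql += N' and LAST_NAME like ''%'' + @lname + ''%''';

    -- Values are passed as parameters, never concatenated into the string,
    -- so there is no injection risk.
    exec sp_executesql @sql,
        N'@fname varchar(40), @lname varchar(40)',
        @fname = @fname_search, @lname = @lname_search;
    ```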

  • To everyone who recommended the articles on dynamic SQL, thank you. That is an elegant solution which I had missed due to my hard-wired "Dynamic SQL is Bad" thinking.

    I also had a play around with using CTEs to evaluate each condition at the start and then join the results back together, but I haven't been able to get onto a decent sized data set to play with the idea.

    The example below assumes a unique key (column name "Id") exists on the Person table which is not related to the first and last names.

    WITH FN AS (
        SELECT *
        FROM Person
        WHERE FirstName LIKE COALESCE(@l_FirstName, FirstName)
    ),
    LN AS (
        SELECT *
        FROM Person
        WHERE LastName LIKE COALESCE(@l_LastName, LastName)
    )
    SELECT FN.*
    FROM FN
    INNER JOIN LN
        ON FN.Id = LN.Id

    Anyone care to test this on a sufficiently non-trivial data set?

    Edit: And yes, I'm sure that full-text indexes would simplify the situation. However, the original article (thanks again for posting, Tony) was more about boolean algebra, and I was trying to offer an alternative structure for that.

  • I ran into performance issues caused by parameter sniffing so many times using this type of construct at work that I had to stop using it.

    Now I use this:

    declare @local_firstname nvarchar(50)
    declare @local_lastname nvarchar(50)

    select @local_firstname = isnull(@firstname, ''),
           @local_lastname = isnull(@lastname, '')

    SELECT ID, FirstName, LastName
    FROM People
    WHERE FirstName LIKE '%' + @local_firstname + '%'
      AND LastName LIKE '%' + @local_lastname + '%'

    It's more work up front but:

    a) makes the query easier to understand (and therefore maintain) and

    b) avoids slow queries caused by the use of inappropriate query plans via parameter sniffing

    Just my 2c

  • Use full-text search. It's fast, it's adaptable, and it's simple.

  • I just used the example given; I have yet to use a LIKE clause in any of my code. The focus of the comment is on the performance issues around the use of optional parameters in a stored procedure.

    hth

  • Thanks for the article.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw
    Learn Extended Events

  • David.Poole (8/31/2010)


    Why not use sp_ExecuteSQL and build up the query that is needed in T-SQL code? This will cache the execution plan for the query statement.

    The other thing I would suggest is that if you are going to use multiple LIKE '%[some string]%' conditions, then you consider using full-text indexes.

    Unfortunately, Full Text Indexing doesn't handle middle-of-string searches the way LIKE does.

    It seems to handle only LIKE 'sometext%'-style prefix searches, not inner-text searches.

    Or at least that's what it did in SQL 2005 and earlier.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.

  • UncleJimBob (8/31/2010)


    I ran into performance issues caused by parameter sniffing so many times using this type of construct at work that I had to stop using it.

    Trace flag 4136 will be your friend.

    That, or simply redeclaring local variables for each parameter and setting the local variable to the parameter.

    Either will get you around Microsoft's silly new parameter sniffing feature that actually hurts performance more than it helps.



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.

  • That, or simply redeclaring local variables for each parameter and setting the local variable to the parameter.

    Yes, that's exactly what I do 🙂

Viewing 15 posts - 46 through 60 (of 60 total)
