Conditional WHERE Clauses and Boolean Algebra

  • Comments posted to this topic are about the item Conditional WHERE Clauses and Boolean Algebra

  • Always good to see a firm grasp of logic being used in computing!

    If I may suggest a variation: a technique I have seen is to use ISNULL or COALESCE around the parameter

    ALTER PROCEDURE [spSearchPeople]
        @firstname nvarchar(50) = null,
        @lastname nvarchar(50) = null
    AS
    BEGIN
        SELECT ID, FirstName, LastName FROM People
        WHERE FirstName LIKE COALESCE('%' + @firstname + '%', FirstName)
    END

    If @firstname is NULL then '%' + @firstname + '%' becomes NULL, so COALESCE falls back to FirstName and the table self-matches.

    This should mean less confusion over indexes and better query-plan caching, since the optimiser can determine that the column we are interested in is FirstName.

    The major drawback is that it is less intuitive and probably T-SQL specific.
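
    To see the NULL case in isolation (an untested sketch, using the same People table as above):

    -- Illustrative only: shows how the predicate behaves when @firstname is NULL.
    DECLARE @firstname nvarchar(50) = NULL;

    SELECT ID, FirstName, LastName
    FROM People
    WHERE FirstName LIKE COALESCE('%' + @firstname + '%', FirstName);
    -- '%' + NULL + '%' evaluates to NULL, so this is effectively
    -- WHERE FirstName LIKE FirstName: each row with a non-NULL FirstName matches itself.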

  • I have to agree that this is a very elegant solution, but it can be very bad for performance.

    Imagine you have a table with 1 million names in it, and you have indexes on first name and on last name.

    If you query that table by first name, SQL will use the first name index.

    If you query that table by last name, SQL will use the last name index.

    If you use the boolean logic, SQL will likely read the whole table because it won't know which index to use. Even if SQL is smart enough to choose one index, it will save that query plan for every execution, and if the conditions change (e.g. the first execution searched by first name and a later one searches by last name) it will use the wrong index.

    For cases like this dynamic SQL is usually better, since SQL will have a different plan for each type of search.

    Please note that this is not an issue in the example case, because the name is always wrapped in '%' wildcards and SQL would not use an index anyway.
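
    Roughly what I mean by dynamic SQL (an untested sketch; the procedure name and parameters just mirror the example above): each combination of supplied parameters produces a different statement text, so sp_executesql compiles and caches a separate plan for each shape of search.

    -- Sketch only: builds the WHERE clause from the parameters that were actually supplied.
    CREATE PROCEDURE spSearchPeopleDynamic
        @firstname nvarchar(50) = NULL,
        @lastname nvarchar(50) = NULL
    AS
    BEGIN
        DECLARE @sql nvarchar(max) =
            N'SELECT ID, FirstName, LastName FROM People WHERE 1 = 1';

        IF @firstname IS NOT NULL
            SET @sql += N' AND FirstName LIKE ''%'' + @firstname + ''%''';
        IF @lastname IS NOT NULL
            SET @sql += N' AND LastName LIKE ''%'' + @lastname + ''%''';

        -- Parameterised execution: each distinct @sql text gets its own cached plan.
        EXEC sp_executesql @sql,
            N'@firstname nvarchar(50), @lastname nvarchar(50)',
            @firstname, @lastname;
    END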

  • Sad to think that SQL would fail to optimise when both FirstName and LastName are provided. I would expect it to use one Index and subselect the results.

    Alternatively it could resolve on both indexes simultaneously and inner join the results. Effectively, create a virtual table of Customers that match first name, a second virtual table of Customers that match the second name and then INNER JOIN the two virtual tables.

    As with all these things, if you are prepared to put in the hours there are usually 42 ways to solve the problem. 😉

    And yes, I'd forgotten the leading %. Thanks for the reminder
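
    Spelled out by hand, the two-virtual-table idea would look something like this (an untested sketch against the People table from the example, with illustrative search values):

    DECLARE @firstname nvarchar(50) = 'jo', @lastname nvarchar(50) = 'smith';

    -- One derived table per filter, joined back together on the key.
    SELECT f.ID, f.FirstName, f.LastName
    FROM (SELECT ID, FirstName, LastName
          FROM People
          WHERE FirstName LIKE '%' + @firstname + '%') AS f
    INNER JOIN (SELECT ID
                FROM People
                WHERE LastName LIKE '%' + @lastname + '%') AS l
        ON l.ID = f.ID;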

  • Please don't get me wrong. If SQL finds that, with the parameters passed, either the first name or the last name index will give you a good plan, it will use that index.

    The problem is that now your query has a plan that uses that index (let's say it used last name).

    The next time you call this same procedure and supply a first name and NULL for last name, the query plan is ready and SQL will not create a new one.

    So in our example SQL will read the whole table through the last name index to find people by first name, which is even worse than a plain full table scan 🙂
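
    To make that concrete (illustrative values, using the example procedure):

    EXEC spSearchPeople @lastname = 'Smith';    -- plan compiled and cached for a last-name search
    EXEC spSearchPeople @firstname = 'John';    -- reuses that cached plan, now filtering on first name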

  • I've seen this construct used many times, and typically it is a little poor on performance. An index on FirstName and/or LastName will usually be used, for sure, but it will tend to be an index scan rather than a seek. I've tried many variations to get this to change, but with no joy as yet.

  • Very good points at the start of the article, and I may do some refresher reading on some bits, but for the specific example would something like this not be better?

    CREATE PROCEDURE [spSearchPeople]
        @firstname nvarchar(50) = '',
        @lastname nvarchar(50) = ''
    AS
    BEGIN
        SELECT ID, FirstName, LastName FROM People
        WHERE FirstName LIKE '%' + @firstname + '%'
          AND LastName LIKE '%' + @lastname + '%'
    END


  • Bad performance in general is my experience with this type of construct, even on some smaller tables and particularly when I used a lot of parameters in the same pattern.

    I am also curious as to why it's posted as an article, being such a minor and commonly used thing, i.e. I could see just as much use in writing an article teaching people how to write SELECT * FROM tablename. But that's just me...

  • Very nice and pragmatic article.

    I also like the solution proposed by Toby Harman, though I'm not able to determine the impact on performance.

    I propose another solution, the one that I use most :hehe: :

    ALTER PROCEDURE [spSearchPeople]
        @firstname nvarchar(50) = null,
        @lastname nvarchar(50) = null
    AS
    BEGIN
        SELECT ID, FirstName, LastName FROM People
        WHERE FirstName LIKE '%' + ISNULL(@firstname, FirstName) + '%'
    END

    obviously it also works perfectly with equality WHERE conditions 😛

    ALTER PROCEDURE [spSearchPeople]
        @firstname nvarchar(50) = null,
        @lastname nvarchar(50) = null
    AS
    BEGIN
        SELECT ID, FirstName, LastName FROM People
        WHERE ISNULL(@firstname, FirstName) = FirstName
    END

    Cheers

  • This is a pretty poor example of how to do this imho.

    If you wish to do the dreaded "LIKE" you should have two sets of params: one set for when you know the whole first / last name and one for partial first / last names.

    This way indexes get used properly and you don't have to use LIKE when you know the whole names, etc.

    COALESCE should also be used to remove the reams of OR / AND WHERE clauses, roughly as sketched below.
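
    Roughly what I mean (an untested sketch; the procedure and parameter names are just placeholders):

    CREATE PROCEDURE spSearchPeopleSplit
        @firstname        nvarchar(50) = NULL,  -- exact first name
        @lastname         nvarchar(50) = NULL,  -- exact last name
        @firstnamePartial nvarchar(50) = NULL,  -- partial first name
        @lastnamePartial  nvarchar(50) = NULL   -- partial last name
    AS
    BEGIN
        -- Exact parameters keep an equality predicate; partial ones fall back to LIKE.
        SELECT ID, FirstName, LastName
        FROM People
        WHERE FirstName = COALESCE(@firstname, FirstName)
          AND LastName  = COALESCE(@lastname,  LastName)
          AND FirstName LIKE COALESCE('%' + @firstnamePartial + '%', FirstName)
          AND LastName  LIKE COALESCE('%' + @lastnamePartial  + '%', LastName)
    END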

  • GOSH!

    :w00t:

    it's the SAME solution proposed by Toby Harman!!

    I didn't notice that he also suggested the use of ISNULL around the parameter...

    Sorry!

    😛

  • 😀

  • I didn't use ISNULL or COALESCE because I defaulted the parameters to '' instead of null, which removes the need.

    Of course you have to ensure that the front end either does not include a parameter when it is blank or passes an empty string instead of null.
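
    For example (illustrative search values, using the procedure above):

    EXEC spSearchPeople @firstname = 'jo';                   -- @lastname falls back to the '' default
    EXEC spSearchPeople @firstname = 'jo', @lastname = '';   -- front end passes an explicit empty string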


  • It's fine on something small, but absolutely appalling performance-wise on bigger tables. GHY if you have a set of joins on big tables and you try to use this approach, because the optimiser sure won't...

  • Not only does this methodology almost always lead to poor performance in whatever query you are writing, be it an index scan or a table scan, it also breaks some of the core principles of code separation. You will lose on cache hits, since there could ultimately be multiple plans generated from the variety of parameters passed in.

    Every procedure should have a single purpose. There is nothing worse than trying to figure out why a procedure performs well sometimes but then performs poorly in other scenarios.

