Conditional WHERE Clauses and Boolean Algebra

  • FYI SSC-Enthusiast is just a Forum Tag...not his/her handle.

  • Hello,

    Lots of posts here about efficiency and COALESCE and ISNULL. The points are well taken, though that really wasn't the point of this little article.

    The point was more meant to be the importance of understanding that there are almost always non-procedural ways to accomplish a query task, and that those solutions are generally more elegant in form.

    I also personally think it is good for any query writer to know how to think purely mathematically, before using any syntax specially provided by the query language, such as COALESCE and ISNULL. I've seen way too many simply copy-and-paste these techniques without really understanding what was going on. Of course - if you have understanding, then by all means run with it!

    As far as efficiency - in 14 years I've always found this technique to be acceptable in the realm of efficiency and (especially in the old days) was given weight over closing the potential security holes of dynamic SQL.

    Of course, every scenario is different and I'm sure there are scenarios in which a different approach is preferable (there always are).

    Again - I wanted to post something stressing an appreciation of fundamentalism over T-SQL in the learning process, I've always found you start with a mathematically elegant solution and can then generally increase its efficiency by taking advantage of whatever specialties the DBMS you are working in provides.

    Thanks for the posts,

    Tony

  • And I think you did a great job Tony.

    [This is a paraphrase from a Teddy Roosevelt quote I have read somewhere here on SSC]

    -- The glory belongs to those that are in the game not in the bleachers talking about the game.

    <><
    Livin' down on the cube farm. Left, left, then a right.

  • I don't claim to be an expert on the subject, but when faced with a query like this:

    SELECT field

    FROM table

    WHERE (field LIKE '%A%') OR (field LIKE '%B%')

    I find I get much better performance if I do this...

    SELECT field

    FROM table

    WHERE (field LIKE '%A%')

    UNION

    SELECT field

    FROM table

    WHERE (field LIKE '%B%')

    INTERSECT and EXCEPT can also be used, along with the judicious application of parentheses, to get the results you desire. It's worked quite well for me on a number of occasions, though I've never tried it with huge data sets.

    ron

    -----
    a haiku...

    NULL is not zero
    NULL is not an empty string
    NULL is the unknown

  • Hello together,

    Despite it is a good idea using SQL in its pure descriptive nature, because it normally leads to generally better performance, one has to be aware of caveats in assuming that things are evaluated in an "intuitive order".

    Neither the relatinonal nor the boolean algebra does explicitly defines an evaluation order.

    In regard to this article it is assumed that the boolean subexpressions will be evaluated from right to left in an fail-fast-manner. Although this could be implemented like that it does not need to. Thus leading to unportable and/or bad performing queries. Just imagine that the default values for the positional parameters would have been changed to the empty string and the evaluation would be left-to-right. The query will do always a full scan over the possibly huge table with returning unwanted tuples.

    Therefore one should never assume the natural order in SQL-Queries as long as These are not guaranteered by the DBS.

  • Rob Fisk (8/31/2010)


    Very good points at the start of the article and I may do some refresher readin on some bits but for the specific example would something like this not be better?

    CREATE PROCEDURE [spSearchPeople]

    @firstname nvarchar(50) = '',

    @lastname nvarchar(50) = ''

    AS

    BEGIN

    SELECT ID, FirstName, LastName FROM People

    WHERE FirstName LIKE '%' + @firstname + '%'

    OR LastName LIKE '%' + @lastname + '%'

    END

    This approach does not correctly answer the search question when @firstname and @lastname are not null.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Let me start by saying that I quite agree with the initial point of the article. When confronted with a complex SQL problem you are better off thinking in Venn diagrams than logic trees. Now into the meat of the discussion:

    There is a way to avoid dynamic SQL while still caching plans correctly, although most of the discussion above about cached plans ignores that the names are unqualified.

    (simplified code)

    [font="Courier New"]IF @FirstName IS NULL AND @LastName IS NULL

    SELECT ID, FirstName, LastName FROM MyDB.dbo.People

    ELSE IF @FirstName IS NULL AND @LastName IS NOT NULL

    SELECT ID, FirstName, LastName FROM MyDB.dbo.People

    WHERE LastName LIKE @LastName -- I usually let the caller supply wildcards

    ELSE IF @FirstName IS NOT NULL AND @LastName IS NULL

    SELECT ID, FirstName, LastName FROM MyDB.dbo.People

    WHERE FirstName LIKE @FirstName -- I usually let the caller supply wildcards

    ELSE -- NEITHER ARE NULL

    SELECT ID, FirstName, LastName FROM MyDB.dbo.People

    WHERE FirstName LIKE @FirstName

    AND LastName LIKE @LastName [/font]

    NOTE: Normally I use BEGIN and END for every conditional, even if there is only a single statement

    In the past I have also checked for wild cards in the parameter and used separate queries with like and equal, but that is beyond the scope of this discussion.

    So is that using sp_executesql and passing in parameters protects against most SQL injection.

  • opc.three (8/31/2010)


    Rob Fisk (8/31/2010)


    Very good points at the start of the article and I may do some refresher readin on some bits but for the specific example would something like this not be better?

    CREATE PROCEDURE [spSearchPeople]

    @firstname nvarchar(50) = '',

    @lastname nvarchar(50) = ''

    AS

    BEGIN

    SELECT ID, FirstName, LastName FROM People

    WHERE FirstName LIKE '%' + @firstname + '%'

    OR LastName LIKE '%' + @lastname + '%'

    END

    This approach does not correctly answer the search question when @firstname and @lastname are not null.

    The parameters are defaulted to an empty string rather than null.

    OK, there is the front end caveat that you don't pass a parameter unless needed but given that there is no need to use ISNULL within the query.

    [font="Arial"]_______________________________________________________
    Change is inevitable... Except from a vending machine.[/font]

  • For those that have read this far, I'd like to suggest that you read this blog[/url] by Gail Shaw on working with this type of query (catch-all query).

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Toby Harman (8/30/2010)


    Always good to see a firm grasp of logic being used in computing!

    If I may suggest a variation to this technique I have seen is to use ISNULL or COALESCE around the parameter

    ALTER PROCEDURE [spSearchPeople]

    @firstname nvarchar(50) = null,

    @lastname nvarchar(50) = null

    AS

    BEGIN

    SELECT ID, FirstName, LastName FROM People

    WHERE

    FirstName LIKE COALESCE('%' + @firstname + '%', FirstName)

    END

    If @firstname is NULL then the % + @firstname + % becomes NULL, so the table self matches

    This should mean less confusion over indices and better query plan caching as the optimiser can determine that the column we are interested in is FirstName

    The major drawback is it is less intuitive and probably T-SQL specific

    The solution making use of COALESCE is more readable in my opinion however it is not logically equivalent to the solution presented in the article when the column being searched allows NULL. The article does not specify the DDL of the table being searched however many natural searches do go against columns that allow NULL, e.g. name parts, address parts, phone numbers, emails, etc.

    NULL != NULL so unlike in the article when @firstname IS NULL the search criteria for First Name is short=circuited enabling columns with a NULL First Name to be returned, the solution making use of COALESCE will attempt to compare a NULL value in the table with itself resulting in that row being excluded from the resultset. Here is a re-runnable example demonstrating the point:

    IF OBJECT_ID(N'tempdb..#addy') IS NOT NULL

    DROP TABLE #addy

    CREATE TABLE #addy

    (

    id INT,

    firstname VARCHAR(20) NULL,

    lastname VARCHAR(20) NULL

    ) ;

    INSERT INTO #addy (id, firstname, lastname) VALUES (1,'Rick','Barnes')

    INSERT INTO #addy (id, firstname, lastname) VALUES (2,'Rick',NULL)

    DECLARE @firstname VARCHAR(20),

    @lastname VARCHAR(20)

    -- we are searching for all names with 'ic' in the first name

    SELECT @firstname = 'ic',

    @lastname = NULL

    -- row with NULL lastname is excluded

    SELECT *

    FROM #addy

    WHERE firstname LIKE COALESCE('%' + @firstname + '%', firstname)

    AND lastname LIKE COALESCE('%' + @lastname + '%', lastname) -- NULL != NULL

    -- row with NULL lastname is included

    SELECT *

    FROM #addy

    WHERE (@firstname IS NULL OR firstname LIKE '%' + @firstname + '%')

    AND (@lastname IS NULL OR lastname LIKE '%' + @lastname + '%')

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Rob Fisk (8/31/2010)


    opc.three (8/31/2010)


    Rob Fisk (8/31/2010)


    Very good points at the start of the article and I may do some refresher readin on some bits but for the specific example would something like this not be better?

    CREATE PROCEDURE [spSearchPeople]

    @firstname nvarchar(50) = '',

    @lastname nvarchar(50) = ''

    AS

    BEGIN

    SELECT ID, FirstName, LastName FROM People

    WHERE FirstName LIKE '%' + @firstname + '%'

    OR LastName LIKE '%' + @lastname + '%'

    END

    This approach does not correctly answer the search question when @firstname and @lastname are not null.

    The parameters are defaulted to an empty string rather than null.

    OK, there is the front end caveat that you don't pass a parameter unless needed but given that there is no need to use ISNULL within the query.

    The problem is not with the default value of the parameters it is with the boolean logic. If someone provide Jim for @firstname and Thorpe for @lastname your query will incorrectly return people with the name Jim Smith, Jim Williams, etc.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • I'd think (from a performance standpoint) using LIKE is the *last* thing you'd ever want to do. Especially the "Like %X%" construction, since at best it's going to do an index scan.

    "LIKE X%" performs adequately since it lets the query optimizer at least limit the scan to a subset of the index, but both "LIKE %X%" and "LIKE %X" are horrible performance killers.

  • Gail Shaw posted a blog entry with several tests on performance for these types of "catch all" queries.

    Catch-all queries[/url]

    (Sorry WayneS, I missed your post.)

  • Here is another link from Gail Shaw relevant to this thread: http://sqlinthewild.co.za/index.php/2009/09/15/multiple-execution-paths/

    People, if you use this type of construct either do it with one of the 3 mechanisms covered by Gail or take my business card because performance will be "suboptimal". 😀

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • It is nice to see people thinking outside the box. This was a short but nice article. As a sidenote another way of conditionally peforming a where is to utilize Not Null on the variables. I do this with queries that pass a multitude of parameters which may or may not have data in them such as:

    Create procedure dbo.some_Procedure_Name

    @var1 int Null

    , @var2 char(3) null

    , @var3 char(2) null

    as

    Select field1, field2, field3

    from TableName

    where

    (@var1 is not null and Field1 = @var1)

    and

    (@var2 is not null and field2 like '%' + @var2)

    and

    (@var3 is not null and Field3 = @var3)

    Now, we could Or these conditions together Or mix and and or depending on the desire results. As the author points out, this type of approach can get rather unwieldly very fast!

    I like figuring this stuff out, makes the day go by faster! 🙂

    Cheers - Dave

Viewing 15 posts - 16 through 30 (of 60 total)

You must be logged in to reply to this topic. Login to reply