The Cost of Function Use In A Where Clause

  • Nice article! I may have to take another look at some of my queries....

    So, something like this:

    Select

    *

    FROM

    CadencedEventCustomer

    WHERE

    ((FullName1 = 'Ed Jones') OR (FullName1 IS NULL))

    is better than using COALESCE, even with the OR statements?

    Select

    *

    FROM

    CadencedEventCustomer

    WHERE

    COALESCE(FullName1,'Ed Jones') = 'Ed Jones'

    ?

  • smonarch (2/28/2008)


    Why not create an index computed column where you can provide this capability for routinely used functions.

    Ummmm.... not sure what you mean by an "index computed column".... how does that differ from what GSquared offered up?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/28/2008)


    smonarch (2/28/2008)


    Why not create an index computed column where you can provide this capability for routinely used functions.

    Ummmm.... not sure what you mean by an "index computed column".... how does that differ from what GSquared offered up?

    I'm thinking he means that if you persist the computed column, you can then include it in the index being used to optimized the query. saves all of the bookmark lookups...

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I can't duplicate the effect documented in the article. In earlier versions of SQL Server, it was well know that the definition of a SARG did not include an expression that included a function. But my experience with recent versions of SQL Server is that this is no longer the case- a value wrapped inside a function will be used by SQL Server for an index seek. I verified this again today with 2005 but IIRC it was also true of at least 2000 as well. There may be some kind of selectivity issue when the incidence of null values is very high for the indexed column. Each null value would presumably have to be evaluated by the function, and maybe the optimizer would realize that a seek would be inefficient, but in the case I checked, where there were no null values in the column, wrapping the column in an ISNULL function produced the exact same exection plan as no function. I also tried another test wrapping an identity column SARG in a CEILING function. Same thing. I think the bottom line is: check your execution plan, make no assumptions.

  • First off, Nice Article. Very well articulated intro to issues with using functions in the Where clause.

    In reference to Peter Tirrell's question. Yes the Or should provide much better results than the Coalesce for the same reason as it will provide better results than the isNULL Function. It will allow an index seek instead of a scan. Granted, on tables with a very small number of rows, the optimizer may choose a scan vs. an index seek so it wouldn't matter, but 99.9% of the time the OR will be more efficient.

    -Luke.

    To help us help you read this[/url]For better help with performance problems please read this[/url]

  • RIck Moudy (2/28/2008)


    If you convert your @varDate local to a smalldatetime local before using it in the SELECT statement you can do a straight comparison of the local variable and the table column.

    INSTEAD OF:

    select * from table_name where dateDiff(d,@varDate,Created_Date) =0

    TRY THIS:

    DECLARE @sdtDate smalldatetime

    SELECT @sdtDate = cast(@varDate AS smalldatetime)

    SELECT * FROM table_name WHERE Created_Date = @sdtDate

    Am I missing something? Why don't just use simply:

    SELECT * FROM table_name WHERE Created_Date = cast(@varDate AS smalldatetime)

    if @varDate worked in datediff will work in directly in cast as well. The column is Create_Date which is not wrapped and the variable doesn't mattter if it is, that is the value we are looking for.

    Don't just give the hungry man a fish, teach him how to catch it as well.

    the sqlist

  • Matt Miller (2/28/2008)


    I'm thinking he means that if you persist the computed column, you can then include it in the index being used to optimized the query. saves all of the bookmark lookups...

    Computed column can be included in an index without persisting it. That's the beauty of computed columns.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Matt Miller (2/28/2008)


    Jeff Moden (2/28/2008)


    smonarch (2/28/2008)


    Why not create an index computed column where you can provide this capability for routinely used functions.

    Ummmm.... not sure what you mean by an "index computed column".... how does that differ from what GSquared offered up?

    I'm thinking he means that if you persist the computed column, you can then include it in the index being used to optimized the query. saves all of the bookmark lookups...

    Thats what I was thinking Jeff. I'm brand new with SQL Server (couple days) but see this as the equivalent to a function based index. I agree however it would be best to eliminate the function in the where clause if possible. But if not then...

  • Jeff Moden (2/28/2008)


    David Jackson (2/28/2008)


    This looks complex but will use your index.

    where CreatedDate between dateadd(dd, datediff(dd,0,Getdate()),0) --midnight TODAY

    and dateAdd(ss,-1,dateAdd(dd,1,dateadd(dd, datediff(dd,0,Getdate()),0))) --23:59:59 TODAY

    ....

    Even then, there is the possibility of something slipping in during that 1 second you've ignored.

    Absolutely true, I have never thought hard about the implications of the word BETWEEN 😀

    Having said that, most if not all of the processing where I work, both here and where I have been in the past do not process OLTP transactions at or around midnight.

    Then again, I have a batch job that kicks off at 7:00 pm that will cause problems if it runs over midnight, so we stop it if it still running at 10:00 pm and investigate!

    And, with the new date types coming out for 2k8, it becomes even more important to not ignore that 1 second.

    Still stuck on 2000 unfortunately, so I'm thinking about skipping 2005 altogether when we do get round to doing an upgrade...

    Input, as ever, heeded.

    Cheers Jeff! 🙂

    Dave J


    http://glossopian.co.uk/
    "I don't know what I don't know."

  • Good artible, however a full table scan isn't being performed, but an Index Scan is which is different to a table scan...

  • Thanks very much....

    Yes, these are very common mistakes.

    We had Hundreds of these occurances throughout our DB and the performance suffered immensely.

    Took a few weeks to clean these all up and the impact was quite dramatic.

    GAJ

    Gregory A Jackson MBA, CSM

  • hmmm.....On our db, these resulted in scans 100% of the time and removing the functions (isNull Usually) fixed the issue 100% of the time.

    I'd have to see specific examples in a specific DB to understand the differing behavior.

    good point about never assume anything ALWAYS Test and check the query plans....

    Cheers,

    GAJ

    Gregory A Jackson MBA, CSM

  • Well...

    After seeing so many suggestions here about improving querries, let me try to Sum up the Articele in one sentence...

    DONT put a field of a table into ANY function. This will void all indexes on this field, and force an Index Scan (Which is ALMOST as bad as a table scan...)

    There where some questions asking whats the performance benefit of it?

    It depends on your Table and the amount of data in it. If you add a table field into a function, it must be checked for EVERY unique entry in your index... And with a table containing millions of rows, or more... That can seriously slow you down... We are talking several orders of magnitude here

  • hmmm.....On our db, these resulted in scans 100% of the time and removing the functions (isNull Usually) fixed the issue 100% of the time.

    I'd have to see specific examples in a specific DB to understand the differing behavior.

    Maybe try it with some other types of functions. My hypothesis is that the isnull behavior may be related to the frequency of null values in the column, but it could be something else. Nevertheless, I verified that it is somewhat more complicated than stated in the article. I definitely got index seeks on SARGs that included a function.

  • I liked the article, but I don't seem to have the problem with my queries that you are describing when using functions like IsNull in the where clause.

    Im running SQL Server 2005. I created a Table to test this with, [Table1] with a non-clustered index on [Column1] (no included columns on that index). It is important to say that my table has a clustered index on it, but not on Column1. The following queries both have identical execution plans.

    Select *

    From Table1

    Where IsNull(Column1,'') = ''

    Select *

    From Table1

    Where Column1 = '' or Column1 is null

    ...and both utilize my index on Column1.

    That being MY case (though my case may be one of the "not always as expected cases")....concerning the post about the Created_Date index. Is it possible that your table does not have a clustered index, thus a HEAP, or that your statistics are out of date? You may not want to rule anything out until you've tried it.

Viewing 15 posts - 31 through 45 (of 98 total)

You must be logged in to reply to this topic. Login to reply