How bad this in terms of performance; udf in WHERE

  • Hi, all

    I heard that this is not a good way to apply udf in WHERE, but not sure is it evaluated each time for each row or just once and is OK in terms of performance: something like below

    --

    declare @Month int;

    SELECT

    C1, C2. C3

    FROM TABLE1

    WHERE MONTH(CDate) BETWEEN UdfPrefMOnth(@month,-1) and udfPrevMonth(@month,-2)

    Thanks

    Mario

  • The UDF isn't the problem here, the MONTH(CDate) is most likely to impact performance as it's non-SARGable.

    It should be easily possible to generate a from and to date rather than a from and to month and remove the function around CDate...

  • You don't want a function (built-in or UDF) on the left side of the operator (in this case, the =) because the function will have to be evaluated for each row in the table. Like Howard said, you're going to want to change it to something like this:

    SELECT C1, C2. C3

    FROM TABLE1

    WHERE CDate BETWEEN starting_date AND ending_date;

    The starting_date and ending_date can be date calculations, UDFs or variables, but the key is to look at the column by itself, not wrapped in a function. If it helps, Lynn Pettis has a page with some common date functions that many people find very useful. It is at http://www.sqlservercentral.com/blogs/lynnpettis/2009/03/25/some-common-date-routines/. While I don't know your UDF functions, I'm guessing you want something similar to this:

    SELECT C1, C2. C3

    FROM TABLE1

    WHERE CDate BETWEEN DATEADD(month, DATEDIFF(month, 0, GETDATE()) - 2, 0) AND

    DATEADD(month, DATEDIFF(month, 0, GETDATE()) - 1, 0);

  • Scalar (and multi-statement table valued) user defined functions can be DEVASTATINGLY BAD!!! There are multiple ways they can screw you. A short list includes:

    1) prevent optimizer from getting/using accurate statistics --> bad query plans and horrible performance and concurrency

    2) prevent parallelism in section(s) of query plan involving UDF

    3) can lead to RBAR (row-by-agonizing-row) processing --> unbelievably bad performance

    Please get a copy of SQL Server MVP Deep Dives 2 and read the chapter Death by UDF for more issues and coverage.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (9/3/2014)


    Scalar (and multi-statement table valued) user defined functions can be DEVASTATINGLY BAD!!! There are multiple ways they can screw you. A short list includes:

    1) prevent optimizer from getting/using accurate statistics --> bad query plans and horrible performance and concurrency

    2) prevent parallelism in section(s) of query plan involving UDF

    3) can lead to RBAR (row-by-agonizing-row) processing --> unbelievably bad performance

    Please get a copy of SQL Server MVP Deep Dives 2 and read the chapter Death by UDF for more issues and coverage.

    In fact, Kevin wrote that chapter and it's very good. If you haven't read the SQL Server MVP Deep Dives books, I'd highly recommend both of them. They're among the better SQL Server books I've read and they cover a variety of topics, so there's something in there for everyone.

  • Pretty much that whole WHERE clause is an issue. Functions on columns or functions to determine values within a WHERE clause, JOIN criteria or a HAVING clause will all impact performance negatively.

    Oh, and one note, it doesn't matter if the function is on the left or right of the comparison operator, it all causes problems:

    FUNCTION(MyColumn) = 'A'

    or

    'A' = FUNCTION(MyColumn)

    They're both bad.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (9/3/2014)


    Pretty much that whole WHERE clause is an issue. Functions on columns or functions to determine values within a WHERE clause, JOIN criteria or a HAVING clause will all impact performance negatively.

    Oh, and one note, it doesn't matter if the function is on the left or right of the comparison operator, it all causes problems:

    FUNCTION(MyColumn) = 'A'

    or

    'A' = FUNCTION(MyColumn)

    They're both bad.

    I'm not sure anyone was suggesting that those two things would be different. I think what Ed meant by "left" side was on the column rather than the variable - perhaps badly worded.

    I didn't think there was much of a performance overhead of:

    MyColumn = FUNCTION('A')

    Are you saying that this is also "bad"? I guess it could have some effect on statistics, but I thought that would occur after the parameter had been evaluated (and parameter sniffing issues would be largely the same as regular paramemters)?

  • HowardW (9/3/2014)


    Grant Fritchey (9/3/2014)


    Pretty much that whole WHERE clause is an issue. Functions on columns or functions to determine values within a WHERE clause, JOIN criteria or a HAVING clause will all impact performance negatively.

    Oh, and one note, it doesn't matter if the function is on the left or right of the comparison operator, it all causes problems:

    FUNCTION(MyColumn) = 'A'

    or

    'A' = FUNCTION(MyColumn)

    They're both bad.

    I'm not sure anyone was suggesting that those two things would be different. I think what Ed meant by "left" side was on the column rather than the variable - perhaps badly worded.

    I didn't think there was much of a performance overhead of:

    MyColumn = FUNCTION('A')

    Are you saying that this is also "bad"? I guess it could have some effect on statistics, but I thought that would occur after the parameter had been evaluated (and parameter sniffing issues would be largely the same as regular paramemters)?

    If your function is a UDF, then it's bad (I'm not sure if deterministic UDFs will be bad as well). I've seen large improvements when changing UDFs on variables or columns in the SELECT list to built-in functions.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (9/3/2014)


    If your function is a UDF, then it's bad (I'm not sure if deterministic UDFs will be bad as well). I've seen large improvements when changing UDFs on variables or columns in the SELECT list to built-in functions.

    In the SELECT list, yes. I'm specifically referring to UDFs around variables in the WHERE clause.

  • HowardW (9/3/2014)


    Grant Fritchey (9/3/2014)


    Pretty much that whole WHERE clause is an issue. Functions on columns or functions to determine values within a WHERE clause, JOIN criteria or a HAVING clause will all impact performance negatively.

    Oh, and one note, it doesn't matter if the function is on the left or right of the comparison operator, it all causes problems:

    FUNCTION(MyColumn) = 'A'

    or

    'A' = FUNCTION(MyColumn)

    They're both bad.

    I'm not sure anyone was suggesting that those two things would be different. I think what Ed meant by "left" side was on the column rather than the variable - perhaps badly worded.

    I didn't think there was much of a performance overhead of:

    MyColumn = FUNCTION('A')

    Are you saying that this is also "bad"? I guess it could have some effect on statistics, but I thought that would occur after the parameter had been evaluated (and parameter sniffing issues would be largely the same as regular paramemters)?

    It depends (you knew I was going there). While a function like that will not lead automatically to a scan in the same way as a function on the column will? No. It can prevent the optimizer from using specific values when checking statistics in the column resulting in differences in the execution plan. You're frequently better off doing those types of functions independently from WHERE/JOIN/HAVING. But as you say, it's a very different problem set.

    The "left side" statement comes up a lot and I think it leads to confusion.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • HowardW (9/3/2014)


    Luis Cazares (9/3/2014)


    If your function is a UDF, then it's bad (I'm not sure if deterministic UDFs will be bad as well). I've seen large improvements when changing UDFs on variables or columns in the SELECT list to built-in functions.

    In the SELECT list, yes. I'm specifically referring to UDFs around variables in the WHERE clause.

    Even if by happenstance you get a perfect (serial) plan, you still lose the option of a parallel plan, which can cause significantly longer executions. And this can come from more than just not having multiple threads available to chew through the data - you also lose some parallelism-requiring optimizations such as batch mode processing in column store index usage, bitmapping, etc.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Grant Fritchey (9/3/2014)


    Pretty much that whole WHERE clause is an issue. Functions on columns or functions to determine values within a WHERE clause, JOIN criteria or a HAVING clause will all impact performance negatively.

    Oh, and one note, it doesn't matter if the function is on the left or right of the comparison operator, it all causes problems:

    FUNCTION(MyColumn) = 'A'

    or

    'A' = FUNCTION(MyColumn)

    They're both bad.

    Thanks for bringing that up. I was about to, but you did it for me.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Hey all !!!

    we have Grant Fritchey here !!!!

    Thanks much for your help !!!!

    M

  • mario17 (9/3/2014)


    Hey all !!!

    we have Grant Fritchey here !!!!

    Thanks much for your help !!!!

    M

    Ha!

    I wasn't aware that was an event, especially with all these people.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (9/3/2014)


    mario17 (9/3/2014)


    Hey all !!!

    we have Grant Fritchey here !!!!

    Thanks much for your help !!!!

    M

    Ha!

    I wasn't aware that was an event, especially with all these people.

    It is an event (every time!). You just have transcended the point where you realize it is happening. 😉

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply