Can using a scalar UDF actually improve performance?

  • I have something like this below:

    --query 1

    SELECT top 5000000 --top five mils

    dbo.[UDF_to_Format_Date]( x.transaction_date )

    ...

    FROM

    dbo.tableX AS x

    LEFT JOIN...

    (a bunch of LJs here)

    UDF is a straight forward one too:

    ALTER FUNCTION dbo.[UDF_to_Format_Date] (@input_date DATETIME )

    RETURNS VARCHAR( 50 )

    AS

    BEGIN

    RETURN CONVERT( VARCHAR( 50 ), @input_date, 121 )

    END

     

    Then I changed query 1 to select x.transaction_date itself , so now it's like

    --query 2

    SELECT top 5000000 --top five mils

    x.transaction_date     --field of type DATE

    ...

    FROM

    dbo.tableX AS x

    LEFT JOIN...

    (a bunch of LJs here)

    Comparing performances of query 1 vs query 2 I noticed that query 1 is always faster that query 2, meaning that it's running faster when it's using UDF

    How come??? :))))))

    • This topic was modified 4 years, 12 months ago by  btio_3000.
  • Can you please post the execution plans for both queries?

    😎

     

  • Also tell how how you're measuring what "faster" means.  If you're using % of Batch in an execution plan, it has the very high possibility of being incorrect compared to what really happens because it's a cost based estimate that mostly ignores the overhead of scalar functions.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Impossible to even guess at this without seeing the full sets of code and the execution plans. Also, when measuring the speed, also measure the resource use, CPU and the number of reads/writes. In isolation, you may see things perform faster, but use more resources that, when put into a production environment, actually hit more blocking and run slower because of the excessive resource use. Performance measurement is about measuring all the important bits, not simply one measure of time.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • What apparently happened was that it was initially tested over remote connection

    Timings were taken "physically", still it was a bit faster with UDF, every time... Still not sure why, could be a fluke?

    When tested at the server, the universe unfolded as it should and things started to make sense again

    Thank you, and I apologize for a panic attack

     

  • Hard to explain exactly what happened, but it's likely to be the network. It's good to measure some queries from the client to understand how your network behaves, but the vast majority of measurement should only be on the server. That should be where most of the variability lies (your network runs however fast your network runs and seldom changes... most of the time).

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • My guess would be that the cost for parallel is set too low and the non UDF query is running with a parallel plan, the UDF version does not run in parallel and therefore is faster. But then again, this is just a speculation as the OP hasn't responded to my request to post the execution plans.

    😎

     

    Just one of those posts where we get "I need an answer but I'm not telling you the question"

  • Hi Eirikur,

    and I apologize for that too. The reason I did not reply with the execution plan is that there is a lot of changes/testing going on, the "original" database was backed up and the one I currently have access to already has numerous changes applied

    And I did not save the original exec.plan.

    My question was, basically, if anyone had anything like that and/or if it's even possible

    Again thank you and again I apologize if I did not clearly phrase it

     

  • I'm thinking that "everything changes" when you're using a remote connection but don't know for sure.  I supposed "It Depends" on the connection settings.  If you've come across connection settings that make UDFs run even a little bit faster, then I'd love to know what they are because it might be a help to everyone provided it's not something like turning arith-abort or ansi-nulls off.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Still learning SQL stuff here, what is the problem with turning ansi nulls off.. I see it in a ton of SPs?

  • Have a look at the following link... pay particular attention to the Light Blue box of text in it...

    https://docs.microsoft.com/en-us/sql/t-sql/statements/set-ansi-nulls-transact-sql?view=sql-server-2017

    A whole lot of front-end Developers in the world wrote code based on ANSI NULLs being turned off (it does make comparisons with NULL values a whole lot easier).   All of a sudden, MS gets the ANSI "religion" and wants to take away the ability to turn ANSI NULLs OFF... can you imagine how much code in this world is going to break if they actually do that?

     

    Why would so many Developers write code with ANSI NULLs turned OFF?  I'm not sure.  However, if you right click on your server instance, then click on properties, then click on connections, you'll notice the default is to have ANSI NULLs "unselected" which means that it defaults to ANSI NULLs being off.  Maybe that's why (emphasizing that's just a speculation).

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Good call, Thanks

  • I noticed in the article you linked to, SSC Guru, that SQL 2017 will not allow you to set ANSI NULLs off. If that is the case, how are we going to filter anything based on wanting to see what is NULL in a column? There are lots of instances I can think of where you want to see records with incomplete data (i.e. NULL in a column you want populated). How would we do this with ANSI NULL set ON?

  • dbodell wrote:

    I noticed in the article you linked to, SSC Guru, that SQL 2017 will not allow you to set ANSI NULLs off. If that is the case, how are we going to filter anything based on wanting to see what is NULL in a column? There are lots of instances I can think of where you want to see records with incomplete data (i.e. NULL in a column you want populated). How would we do this with ANSI NULL set ON?

    The power of all computer languages is in its intrinsic functions.  T-SQL and SQL Server are no exception.

     

    With that in mind and with the idea teaching a man to fish, you should spend some serious time on the following page and the pages of the various links on that page.

    https://docs.microsoft.com/en-us/sql/t-sql/functions/functions?view=sql-server-2017

    As for your immediate question, please see the following page...

    https://docs.microsoft.com/en-us/sql/t-sql/queries/is-null-transact-sql?view=sql-server-2017

    As an intro to more "NULL" functionality, please see the following, as well...

    https://docs.microsoft.com/en-us/sql/t-sql/functions/isnull-transact-sql?view=sql-server-2017

    https://docs.microsoft.com/en-us/sql/t-sql/language-elements/coalesce-transact-sql?view=sql-server-2017

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • p.s.  I'm not 100% sure but I don't believe they've disabled the ability to turn ANSI NULLs off yet in 2017.  It has been an advisory for many years, though.  I still recommend that everyone get out of the habit of using WHERE SomeColumn = NULL and WHERE SomeColumn <> NULL or doing virtually any other direct relational comparison to NULL.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply