Create a scalar function (or not)

  • I always assumed that there would not be much difference of performance between selecting a complex expression and selecting a scalar function encapsulating the expression until I encountered the following case:

    My expression is quite complex so I wrapped it into a scalar function.
    CREATE FUNCTION TEST(@EntityID char(43), @RefDate DATETIME)
    RETURNS CHAR(1) AS BEGIN
    RETURN IIF(EXISTS(
     SELECT NULL FROM EntityOfficer D
     CROSS APPLY TimelineMatch(D.TimelineAwareID, '{0000-1FAC0BA8-5347-4580-8A3F-02C9BB305E0E}', '{0000-A414A143-A5C3-470B-BB9C-33FC1E84B542}', @RefDate)
     LEFT JOIN EntityFile EF ON EF.EntityID = D.EntityID
     WHERE D.IsPurged = 'F' AND D.OfficerID = @EntityID AND EF.IsPurged = 'F' AND EF.IsGroup = 'T'
     AND D.IsOtherBoard = 'F' AND D.IsExecutiveBoard = 'F'
    ), 'T', 'F')
    END

    I can then select the expression:

    SELECT
     IIF(EXISTS(SELECT NULL FROM EntityOfficer D CROSS APPLY TimelineMatch(D.TimelineAwareID, '{0000-1FAC0BA8-5347-4580-8A3F-02C9BB305E0E}', '{0000-A414A143-A5C3-470B-BB9C-33FC1E84B542}', '2001-01-01') LEFT JOIN EntityFile EF ON EF.EntityID = D.EntityID WHERE D.IsPurged = 'F' AND D.OfficerID = P.EntityID AND EF.IsPurged = 'F' AND EF.IsGroup = 'T' AND D.IsOtherBoard = 'F' AND D.IsExecutiveBoard = 'F'), 'T', 'F')
    FROM Person P

    or select the function:

    SELECT
     dbo.Test(P.EntityID, '2001-01-01')
    FROM Person P

    Selecting the function is about 5 times faster than selecting the expression. Both plans are attached and we clearly see that the «Function» plan is way more simple.

    Does it mean that wrapping complex expressions in scalar functions is a good practice?
    Is there something can I do for the «Expression» statement to be as fast as the «Function» statement?

    Thanks.

  • In this case, it appears that your expression already includes a function in it:  TimeLineMatch.   Not sure how that affects things or what kind of function it is.   I personally am not going to trust a zip file, and would be much more comfortable downloading a .sqlplan file.   Many others may feel similarly.   After all, we're effectively having to decide to trust "a stranger on the internet".   It would appear that in this case, the old axiom of "it depends" has certainly applied here.   One good test is better than 1,000 expert opinions.   Have you also compared resource usage between the various methodologies?

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • The generic "rule of thumb" is that scalar functions are usually bad from a performance perspective and the recommendation is usually to opt for either writing the expression directly or using an inline table valued function. However, it's not necessarily always the case and particularly with code that gets overly complex or that the optimiser isn't so good at (such as CASE and it's variants IIF, CHOOSE etc) then sometimes a scalar function can help. More so if you can get away with schema binding the function and SQL Server is able to identify it as deterministic.

  • Thanks andy,

    I moved the expression into an OUTER APPLY and the statement runs faster than ever (and that was expected).

  • Can you post the code for the "TimelineMatch" function that you're calling from "inline"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here is the TimelineMatch function. I have to admit it is a strange function and its purpose is to avoid writing «LEFT JOIN TimelineEvent T1...», «LEFT JOIN TimelineEvent T2...» and «WHERE @RefDate IS NULL...» in every statement that needs to fetch a StartDate and an EndDate.

    CREATE FUNCTION [dbo].[TimelineMatch](@TimelineAwareID CHAR(43), @TimelineEventTypeID1 CHAR(43), @TimelineEventTypeID2 CHAR(43), @RefDate DATETIME)
    RETURNS TABLE AS RETURN
    SELECT
     T1.EffectiveDate AS StartDate,
     T2.EffectiveDate AS EndDate
    FROM (SELECT NULL AS X) N
    LEFT JOIN TimelineEvent T1 ON T1.TimelineAwareID = @TimelineAwareID AND T1.TimelineEventTypeID = @TimelineEventTypeID1 AND T1.IsPurged = 'F'
    LEFT JOIN TimelineEvent T2 ON T2.TimelineAwareID = @TimelineAwareID AND T2.TimelineEventTypeID = @TimelineEventTypeID2 AND T2.IsPurged = 'F'
    WHERE @RefDate IS NULL OR ((T1.EffectiveDate IS NULL OR T1.EffectiveDate <= @RefDate) AND (T2.EffectiveDate IS NULL OR T2.EffectiveDate >= @RefDate))

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply