Scalar functions and performance drop off

  • Hi,
    I have tried a few things now and I am not really getting anywhere. I have a database which is providing some relevancy scores basically telling me how useful a result is based on the range it falls within.
    This uses many scalar functions to calculate

    AS you can see it isnt the most complicated maths but does have references to other scalar functions.

    Following this there is then inline table valued functions e.g.

    The performance is really poor, we have tried converting all the scalar functions into TVF's but that isnt much help, we have also looked into turning the scalar functions into CLR's to improve performance but this didnt really help either.

    Does anyone have any suggestions  as to where we should go next to try to improve performance.

    Many thanks

    Oliver

  • While it's helpful to post your code, we clearly don't have all of it (e.g. dbo.erf()), and you haven't mentioned what kind of performance you are actually getting, other than to indicate it's not where you want it to be.   Can you post an actual execution plan?   We need to be able to see where the problem is coming from.   For example, how many rows come out of each of the CTE's in the TVF that you posted?  What are the indexes on those tables?   We need a LOT more info to be able to help much, as anything I say now would be pure speculation.   Be specific about run times and desired run times.   How many rows are in these tables?   How much data is there overall?   How may rows are expected as output?

    You ARE on a better track by using table valued functions, but the underlying query still has to perform well, and that's where the tuning efforts need to be focused.   Also, it pays to NOT nest function calls unless such nested calls are trivial where performance is concerned.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • I might also suggest that some of your CTEs might be better combined, returning more results, with the column(s) from the WHERE included. Not sure, but definitely need more information on the exec plan and where things are slow.

  • Thank you for your answers and I appreciate fully it is hard to get a full picture of the situation from my post. The problem is I have many of these scalar functions and simply cant post them all. I included the code to give you an idea of what it was like, basically we have an inline function leading which uses scalar functions which themselves use scalar functions. We are working with 100,000's of rows so not crazy volumes. The main issue is that they just keep running I havent had a query actually complete it ran for 15 hours - how can I get an execution plan if it wont run?

    We tried to convert more to TVF's, initially as you suggested Steve using CTE's but quickly reach the CTE limit. The next place I am going is to have table variables and just run update statements against these for each Case statement. Then break it down into smaller pieces to try to see from the execution plan where performance is falling off.

    I know  nested  scalar functions arent great but as you can see this isnt super complicated functions.

    Thank you for the support, the guidance to say I am on the right lines really helps.

  • For example the direction is to use cross apply to replace the scalar functions.

  • olibbhq - Thursday, December 7, 2017 11:49 AM

    Thank you for your answers and I appreciate fully it is hard to get a full picture of the situation from my post. The problem is I have many of these scalar functions and simply cant post them all. I included the code to give you an idea of what it was like, basically we have an inline function leading which uses scalar functions which themselves use scalar functions. We are working with 100,000's of rows so not crazy volumes. The main issue is that they just keep running I havent had a query actually complete it ran for 15 hours - how can I get an execution plan if it wont run?

    We tried to convert more to TVF's, initially as you suggested Steve using CTE's but quickly reach the CTE limit. The next place I am going is to have table variables and just run update statements against these for each Case statement. Then break it down into smaller pieces to try to see from the execution plan where performance is falling off.

    I know  nested  scalar functions arent great but as you can see this isnt super complicated functions.

    Thank you for the support, the guidance to say I am on the right lines really helps.

    Well, we can help a lot more if you can start breaking out individual queries and get execution plans for those as raw queries.   When you use CTE's and get 100K + rows, that might be part of the problem.   As the other Steve suggested, the one function you posted that joined multiple CTE's together might be brought to heel somewhat easily if we can just have more details, as the JOINs for 3 of those CTEs appear to be identical, so you might not need to go after those tables 3 separate times, and perhaps can just do so once, but again, more details are needed.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • olibbhq - Thursday, December 7, 2017 11:54 AM

    For example the direction is to use cross apply to replace the scalar functions.

    ALTER FUNCTION [dbo].[F_TR_C5](@mu1 float, @mu12 float,
                                                @mu2 float, @mu22 float,
                                                @t1 FLOAT,@t2 FLOAT)
    RETURNS TABLE
    AS
    RETURN
    (WITH I1 AS --CHECK INTERVALS ARE SELF CONSISTENT
            (SELECT @mu1 as mu1,
                    CASE WHEN @mu12 IS NULL THEN @mu1 WHEN @mu12>@mu22 THEN @mu22 ELSE @mu12 END AS mu12,
                    CASE WHEN @mu2<@mu1 THEN @mu1 ELSE @mu2 END as mu2,
                    CASE WHEN @mu22 IS NULL THEN @mu2 ELSE @mu22 END AS mu22),

        -- SORT OUT IF BOUNDING INTERVALS OVERLAP
        I2 AS (SELECT mu1,
                    CASE WHEN mu1=mu2 AND mu12=mu22 THEN mu1
                        WHEN mu12>mu2 THEN (mu2+mu12)/2 ELSE @mu12 END AS mu12,
                    CASE WHEN mu1=mu2 AND mu12=mu22 THEN mu22
                        WHEN mu12>mu2 THEN (mu2+mu12)/2 ELSE @mu2 END AS mu2,
                    mu22
                    FROM I1),
        
        -- Evaluate FUNCTION IN 3 parts
        X AS (SELECT CASE WHEN mu1 = mu12 THEN 0
                        WHEN @t1<=mu1 AND @t2>=mu12 THEN .5*(mu12-mu1)
                        WHEN @t1<mu12 AND @t2>=mu12 THEN 0.5*(mu12-@t1)*(mu12-2*mu1+@t1)/(mu12-mu1)
                        WHEN @t1<=mu1 AND @t2>mu1 THEN 0.5*(@t2-mu1)*(@t2-mu1)/(mu12-mu1)
                        WHEN @t1<mu12 AND @t2>mu1 THEN 0.5*(@t2-@t1)*(@t2+@t1-2*mu1)/(mu12-mu1)
                        ELSE 0 END AS X1,
                    CASE WHEN @t1<=mu12 AND @t2>=mu2 THEN mu2-mu12
                        WHEN @t1< mu2 AND @t2>=mu2 THEN mu2-@t1
                        WHEN @t1<=mu12 AND @t2>mu12 THEN @t2-mu12
                        WHEN @t1< mu2 AND @t2>mu12 THEN @t2-@t1
                        ELSE 0 END AS X2,
                    CASE WHEN mu2=mu22 THEN 0
                        WHEN @t1<=mu2 AND @t2>=mu22 THEN .5*(mu22-mu2)
                        WHEN @t1< mu22 AND @t2>=mu22 THEN 0.5*(mu22-@t1)*(mu22-@t1)/(mu22-mu2)
                        WHEN @t1<=mu2 AND @t2> mu2 THEN 0.5*(@t2*@t2-mu2*mu2+2*mu22*(mu2-@t2))/(mu2-mu22)
                        WHEN @t1< mu22 AND @t2> mu2 THEN 0.5*(@t2*@t2-@t1*@t1+2*mu22*(@t1-@t2))/(mu2-mu22)
                        ELSE 0 END AS X3,
                    CASE WHEN (mu2+mu22-mu1-mu12)*abs(@t2-@t1)=0 THEN NULL
                        ELSE 2/(mu2+mu22-mu1-mu12)/abs(@t2-@t1) END AS DT
                     FROM I2)

        -- COMBINE ELEMENTS AND CORRECT FOR SMALL INTERVALS
        SELECT CASE WHEN X1+X2+X3 <1 and DT>1 THEN power(X1+X2+X3,2)*DT
                    ELSE (X1+X2+X3)*DT END AS C5
            FROM X
    )

    ALTER FUNCTION [dbo].[F_TR_E5](@mu1 float, @mu12 float,
                                    @mu2 float, @mu22 float,
                                    @t1 FLOAT, @t12 float,
                                    @t2 FLOAT, @t22 float)
    RETURNS TABLE
    AS
    RETURN
    (
        -- Add the SELECT statement with parameter references here
        SELECT CASE WHEN P.C5>P1.C5 THEN P1.C5 ELSE P.C5 END AS E5
        FROM dbo.F_TR_C5(@mu1, @mu12, @mu2, @mu22,(@t1+@t12)/2,(@t2+@t22)/2) AS P
            CROSS APPLY dbo.F_TR_C5(@t1, @t12,@t2, @t22,(@mu1+@mu12)/2, (@mu2+@mu22)/2) AS P1
    )

    Be very careful with that.  It usually results in Nested Loops in the execution plan, so you could get a nasty surprise with two functions using cross apply that are table-valued and each returns a sizable number of rows, as you'll get a Cartesian product and the number of rows coming out could be gigantic.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply