replace chain of function calls with better code

  • We have a scalar function that is used to add a specific number of business days to a date however to dev who wrote it has gone overkill on the use of it and crashed the ETL - it now never finishes.

    We have approx 75k records in the table in question and depending on certian values the function can be called anywhere from 12 to 28 times per record (remeber 75k records).

    I need to refactor this code ASAP and I am wondering about the most performant way to achieve  it.

    I am thinking possibly using cross apply but unsure if it would gove any performace boost.

    Does anyone have any ideas?

     

    Cheers,

    Dave

  • Can you show us the code for the function?

    If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
    See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

  • Agreed, we need to see the code.

    And, do you have a table with non-workdays in it?  We'll need to know the DDL for that table to properly determine business days.

    SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: "If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them."

  • To add to everyone else asking about the code, are you sure it is the function that "crashed the ETL - it now never finishes."?  I am wondering if MAYBE you have some blocking going on at the table level causing it to not finish.  Might not hurt to review the execution plan as well.

    Might not hurt to look at how that function is called... 12-28 times per record MIGHT make sense to have the data tossed into a persisted column rather than repeated calls to the function.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!

  • david_h_edmonds wrote:

    We have a scalar function that is used to add a specific number of business days to a date however to dev who wrote it has gone overkill on the use of it and crashed the ETL - it now never finishes.

    We have approx 75k records in the table in question and depending on certian values the function can be called anywhere from 12 to 28 times per record (remeber 75k records).

    I need to refactor this code ASAP and I am wondering about the most performant way to achieve  it.

    I am thinking possibly using cross apply but unsure if it would gove any performace boost.

    Does anyone have any ideas?

    Cheers,

    Dave

    CROSS APPLY with a scalar function is like drinking to celebrate sobriety. 😀

    We're all asking to see the function because that will explain to us WHAT you need to do and we'll figure out a better way HOW to do it.

    If you're looking elsewhere for a solution and someone uses a recursive CTE to do it, forget it.  A While loop is faster than that.  Seriously... been there, done that.

    Also, let us know if you have a "Calendar" table and/or a "Holiday" table and, if you do, post the CREATE TABLE statement for the ones you have.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Morning all, sincere apologies for the radio silence.

    The function is as follows:

    create function 
    dbo.mydatefunction (@Date date, @DaysToAdd int)


    returns Date as
    begin

    declare @Dates table
    (
    Date Date,
    Holiday Int,
    Weekend Int
    )

    Insert into @Dates
    Select
    Date,
    Is_Holiday,
    Is_Weekend
    from
    (select
    Date,
    Is_Holiday,
    Is_Weekend,
    Row_Number() over (Order by Date Asc) as rn
    from
    dbo.D_Date
    where
    Date > @Date and Is_Holiday = 0 and Is_Weekend = 0
    ) as data
    where rn <= @DaysToAdd

    declare @ReturnDate Date = (Select MAX(Date) from @Dates);
    Return @ReturnDate
    End;

    Cheers,

    Dave

  • There are some very clever people in this forum who will certainly come up with great ideas to help you. But here is my first attempt at speeding things up.

    First, create a numbered VIEW over your date table:

    CREATE VIEW dbo.mydateview WITH SCHEMABINDING
    AS
    SELECT d.Date
    ,rn = ROW_NUMBER() OVER (ORDER BY d.Date ASC)
    FROM dbo.D_Date d
    WHERE d.Is_Holiday = 0
    AND d.Is_Weekend = 0;

    Next, change your function to be an iTVF which references this view. No need for any table variables nor inequalities in the WHERE clause:

    CREATE FUNCTION dbo.mydatefunction
    (
    @Date DATE
    ,@DaysToAdd INT
    )
    RETURNS TABLE
    AS
    RETURN
    (
    SELECT Date = LEAD(d.Date, @DaysToAdd) OVER (ORDER BY d.Date)
    FROM dbo.mydateview d
    WHERE d.Date = @Date
    );

     

    If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
    See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

  • Make sure the D_Date table is uniquely clustered on Date (it should already be, but verify, just in case).  I'd stick with a scalar function for now, but get rid of all the wasted things being done in the code.

    SET ANSI_NULLS ON;
    SET QUOTED_IDENTIFIER ON;
    GO
    CREATE FUNCTION dbo.mydatefunction (
    @Date date,
    @DaysToAdd int
    )
    RETURNS date
    AS
    BEGIN
    RETURN (
    SELECT MAX(Date) AS Date
    FROM (
    SELECT TOP (@DaysToAdd) Date
    FROM dbo.D_Date
    WHERE Date > @Date AND
    Holiday = 0 AND
    Weekend = 0
    ORDER BY Date
    ) AS derived
    )
    /*end of function*/
    END

    SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: "If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them."

  • If you're using this function against multiple rows in a query, there's no way that I'd leave it as a scalar function.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I would definitely convert that scalar function to an inline-table valued function.  You don't actually need a separate view to implement @Phil's solution - which can be done using either a CTE or derived-table to number the rows.

    It would also help to see the code that is using this function - and determine why it would be called for multiple columns.  Do you really have 12 to 28 different dates in a single row where you need to calculate a future working date?

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • I echo what the others are saying.  Can you provide some sample data as well as how you are determining the @DaysToAdd  parameter for each date?

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • In addition to the fact that dedicated development teams https://mlsdev.com/blog/medical-app-development are better able to meet customer demand, they tend to cost less to implement. Smaller businesses often have a difficult time increasing their workforce because salaries tend to be quite low. However, by employing dedicated development teams, businesses can increase their workforce size without incurring increased labor costs.

    • This reply was modified 2 days ago by  matilda.
  • The horrible truth is that our calendar system is very irregular in our definition of the business day is even worse. Trying to do some temporal calculations like this with the function will not work. I strongly recommend that you do a calendar table that includes an ordinal day and an ordinal business day. The DATE data type is small and you can put a few centuries in a relatively small table.

    Just as an exercise, assume that your company does not use Easter as a business day. Have you ever seen the computations required to compute Easter? But wait, it gets worse (I have a personal horror story on this one); which Easter? Catholic or Orthodox? Some years they match up. Some years they do not match up.. And sometimes business is called and postponed for things like, oh I don't know,say 9/11 or holidays set by decree.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • I very strongly prefer a separate "nonwork_days" table.  The table needs nothing only the date and a tinyint reason code for why that date is a nonwork day.

    The big problem with a calendar table being used to determine nonwork days is that people rely on a host of flags to determine work days, viz:

    SELECT COUNT(*) AS workdays_count FROM dbo.calendars WHERE is_weekend = 0 AND is_holiday = 0 ...

    Now you have code in dozens or hundreds or more places each determining work days.  Bad idea.  Even more critically, what happens then if you need to work on a weekend day for some reason?  It's almost impossible to do cleanly.  What if you need to work on a holiday because of an emergency? (such as COVID or a fire or flood).  Truly nearly impossible to do when code in hundreds of places is relying on "is_holiday" to control work logic.

    You naturally can use a calendar table for other purposes, such as determining fiscal year, etc.  But a basic calendar table should not be used for determining work days.

    SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: "If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them."

  • I second the "nonwork_days" table, but I would make sure to design the reason code table well.

    You would want to make sure that the reason code table has a reason code identifier, a reason code description, and a reason code location.  The location may need to be broken up into multiple columns to handle country, state/province, and potentially city.  This will allow you to quickly and easily account for holidays in the USA vs CANADA vs JAPAN vs CHINA vs etc.

     

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply