Average across non-business days

  • Hi all,

    I need to calculate average rates across days. I do not have records for all days, but I need to consider the days without the rates as carrying the rates of the previous day, for example

    • 2nd-November : 8
    • 3rd-November : 8
    • 4th-November : 6
    • 7th-November : 8

    The above 4 records should generate an average of exactly 7. As we do not have records for November 5 and November 6th, those dates should take the last known rate, which was posted on November 4th.

    What script I can write to meet this need?

  • You have not given us much to work with.

    Here is a generic pattern that should get you started

    DECLARE @StartDate date = '2019-11-02';
    DECLARE @EndDate date = '2019-11-07';

    WITH T(N) AS (SELECT N FROM (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS X(N))
    , DATES(TheDate) AS (SELECT TOP(DATEDIFF(dd, @StartDate, @EndDate) +1)
    TheDate = DATEADD(DD, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1, @StartDate)
    FROM T AS T1 -- MAX Value = 16^1 = 16
    CROSS JOIN T AS T2 -- MAX Value = 16^2 = 256
    CROSS JOIN T AS T3 -- MAX Value = 16^3 = 4,096
    CROSS JOIN T AS T4 -- MAX Value = 16^4 = 65,536
    )
    SELECT dd.TheDate
    , RateVal = ISNULL(src.[YourRateCol], r.[YourRateCol])
    FROM DATES AS dd
    LEFT JOIN [YourSchema].[YourTable] AS src
    ON dd.TheDate = src.[YourDateColumn]
    OUTER APPLY (
    SELECT TOP(1) rr.[YourRateCol]
    FROM [YourSchema].[YourTable] AS rr
    WHERE rr.[YourDateColumn] < src.[YourDateColumn]
    ) AS r
  • you need to use a table of dates (possibly a recursive CTE) and figure out how to put  the last working day's value in to the table.

    I think you might need to add some detail to the question - as the average there is 7.5

    MVDBA

  • Thank you Mike.

    If you say that the average is 7.5, then you misunderstood the question.

    There are 4 records but there are actually 6 instances that affect the average: all dates from November secon to November Seven inclusive. The average is 8+8+6+6+6+8 divided by 6 which makes 7.

     

  • Please learn the difference between a row and a record. It's very important for RDBMS.

    I also wish you would learn to read the posting netiquette for this forum. Where is your DDL? Now we have to do all the work that you fat or too rude to do for us. Do you already have a calendar table? Until you tell us this, we can only guess at what you currently have. Should the missing rose be inserted? Or should we use a view? Or were you intending that we do your job for you? Please make a little effort for us.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • MVDBA (Mike Vessey) wrote:

    you need to use a table of dates (possibly a recursive CTE) and figure out how to put  the last working day's value in to the table.

    I think you might need to add some detail to the question - as the average there is 7.5

    Gosh... I hope it's not a recursive CTE.  Even for just a small handful of rows, they're painfully inefficient and slow when it comes to generating things like dates.  Please see the following...

    https://www.sqlservercentral.com/articles/hidden-rbar-counting-with-recursive-ctes

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • barak wrote:

    Hi all,

    I need to calculate average rates across days. I do not have records for all days, but I need to consider the days without the rates as carrying the rates of the previous day, for example

      <li style="list-style-type: none;">

    • 2nd-November : 8
      <li style="list-style-type: none;">

    • 3rd-November : 8
      <li style="list-style-type: none;">

    • 4th-November : 6
      <li style="list-style-type: none;">

    • 7th-November : 8

     

    The above 4 records should generate an average of exactly 7. As we do not have records for November 5 and November 6th, those dates should take the last known rate, which was posted on November 4th.

    What script I can write to meet this need?

    Understood... I think I might have a fun way to pull this one off without the use of a Calendar table.  I'll be back...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok... here's the code to create and populate the test table.  This is what Joe Celko was talking about.  Please see the article at the first link in my signature line below to help us help you on future posts.

    -- drop table #MyHead
    --===== Create the test table
    CREATE TABLE #MyHead
    (
    Date DATE PRIMARY KEY CLUSTERED
    ,Rate INT
    )
    ;
    INSERT INTO #MyHead WITH (TABLOCK)
    (Date,Rate)
    VALUES ('02 Nov 2019',8)
    ,('03 Nov 2019',8)
    ,('04 Nov 2019',6)
    ,('07 Nov 2019',8)
    ;

    And this demonstrates the solution I was talking about.  No Calendar table required.

    --===== This solves the problem as requested
    WITH CTE AS
    (
    SELECT Days = DATEDIFF(dd,Date,LEAD(Date,1,DATEADD(dd,1,Date)) OVER (ORDER BY Date))
    ,Rate
    FROM #MyHead
    )
    SELECT AvgRate = SUM(Rate*Days+0.0)/SUM(Days)
    FROM CTE
    ;

    Basically, LEAD is used to find the next date.  If there is no next date, 1 is added to current date.

    Then we calculate the difference in days between that "next" date we just calculated and the current date.  We keep that count as"Days".

    From there, the math is pretty straight forward.  For each row, multiply the number of days times the rate and take the sum of that.  Then, just divide that by the total number of days.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Darnit Jeff

    Where does you mind go to come up with such perlers?

  • but joe was talking about roses (read his post) - jeff , where are my roses?

    MVDBA

  • jcelko212 32090 wrote:

    Please learn the difference between a row and a record. It's very important for RDBMS.

    I also wish you would learn to read the posting netiquette for this forum. Where is your DDL? Now we have to do all the work that you fat or too rude to do for us. Do you already have a calendar table? Until you tell us this, we can only guess at what you currently have. Should the missing rose be inserted? Or should we use a view? Or were you intending that we do your job for you? Please make a little effort for us.

    Apologies,

    Didn't mean to offend you or anyone else.

    I should have said that I didn't explain myself well rather than saying that he misunderstood my point.

    Anyway, at 70kg, I don't think I am that fat 🙂

  • Jeff Moden wrote:

    Ok... here's the code to create and populate the test table.  This is what Joe Celko was talking about.  Please see the article at the first link in my signature line below to help us help you on future posts.

    -- drop table #MyHead
    --===== Create the test table
    CREATE TABLE #MyHead
    (
    Date DATE PRIMARY KEY CLUSTERED
    ,Rate INT
    )
    ;
    INSERT INTO #MyHead WITH (TABLOCK)
    (Date,Rate)
    VALUES ('02 Nov 2019',8)
    ,('03 Nov 2019',8)
    ,('04 Nov 2019',6)
    ,('07 Nov 2019',8)
    ;

    And this demonstrates the solution I was talking about.  No Calendar table required.

    --===== This solves the problem as requested
    WITH CTE AS
    (
    SELECT Days = DATEDIFF(dd,Date,LEAD(Date,1,DATEADD(dd,1,Date)) OVER (ORDER BY Date))
    ,Rate
    FROM #MyHead
    )
    SELECT AvgRate = SUM(Rate*Days+0.0)/SUM(Days)
    FROM CTE
    ;

    Basically, LEAD is used to find the next date.  If there is no next date, 1 is added to current date.

    Then we calculate the difference in days between that "next" date we just calculated and the current date.  We keep that count as"Days".

    From there, the math is pretty straight forward.  For each row, multiply the number of days times the rate and take the sum of that.  Then, just divide that by the total number of days.

    Brilliant!

    This is the kind of answer I was looking for. succinct and sophisticated.  I will try it next week and see how it goes

    Thanks a lot

  • barak wrote:

    jcelko212 32090 wrote:

    Please learn the difference between a row and a record. It's very important for RDBMS.

    I also wish you would learn to read the posting netiquette for this forum. Where is your DDL? Now we have to do all the work that you fat or too rude to do for us. Do you already have a calendar table? Until you tell us this, we can only guess at what you currently have. Should the missing rose be inserted? Or should we use a view? Or were you intending that we do your job for you? Please make a little effort for us.

    Apologies,

    Didn't mean to offend you or anyone else.

    I should have said that I didn't explain myself well rather than saying that he misunderstood my point.

    Anyway, at 70kg, I don't think I am that fat 🙂

    You've not offended anyone.  That's just Celko playing the part of a consummate troll, especially on newbies.  He thinks they have to get past him as some sort of right of passage.  If anything on this thread is fat, it's his ego. 😉  It's a shame to see someone I used to look up to stoop so low on such a regular basis.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • barak wrote:

    I will try it next week and see how it goes

    Awesome.  Thank you and looking forward to feedback.  This could easily be turned into a high performance "iSF" (Inline Scalar Function), which is really an iTVF (Inline Table Valued Function) that returns a single value.  If you're using SQL Server 2019, it might "inline" itself.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • at 70kg you are pretty much the same as me. - I don't think joe meant it that way

    MVDBA

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply