AVG a range of dates while grouping them by supplier

  • I have multiple suppliers with a purchase order date. I'm trying to calculate the average time between purchase orders but keep it grouped by supplier. I thought I could assign a row_number and over order by, but that doesn't take into account any breaks for the change in supplier IDs.

    CREATE TABLE jec_test_supplier (supplier_id varchar(10), date_created datetime)

    INSERT INTO jec_test_supplier (supplier_id, date_created)

    VALUES (9,'03-14-17'),

    (9,'04-07-17'),

    (9,'04-27-17'),

    (9,'05-30-17'),

    (9,'06-15-17'),

    (9,'06-23-17'),

    (9,'07-21-17'),

    (9,'08-19-17'),

    (9,'09-08-17'),

    (9,'10-04-17'),

    (9,'10-21-17'),

    (9,'11-14-17'),

    (9,'12-14-17'),

    (9,'01-06-18'),

    (9,'01-23-18'),

    (9,'02-12-18'),

    (25,'03-20-17'),

    (25,'03-22-17'),

    (25,'03-22-17'),

    (25,'04-26-17'),

    (25,'05-05-17'),

    (25,'05-05-17'),

    (25,'05-06-17'),

    (25,'05-16-17'),

    (25,'05-30-17'),

    (25,'06-14-17'),

    (25,'06-21-17'),

    (25,'06-21-17'),

    (25,'06-27-17'),

    (25,'07-06-17'),

    (25,'07-15-17'),

    (25,'08-03-17'),

    (25,'08-22-17'),

    (25,'08-22-17'),

    (25,'08-22-17'),

    (25,'08-22-17'),

    (25,'08-22-17'),

    (25,'09-05-17'),

    (25,'09-21-17'),

    (25,'09-25-17'),

    (25,'10-14-17'),

    (25,'11-07-17'),

    (25,'11-14-17'),

    (25,'11-29-17'),

    (25,'12-28-17'),

    (25,'01-09-18'),

    (25,'01-22-18'),

    (25,'02-01-18'),

    (25,'02-22-18')

    Below didn't work since it didn't break by supplier.


    SELECT h.supplier_id,CONVERT(varchar(10), h.date_created, 101)

    , ROW_NUMBER() OVER


    (  ORDER BY (h.supplier_id), (h.date_created)


    ) AS ORD -- Sequential order of POs based on date: Ex. 1,2,3,4,5...


    FROM jec_test_supplier hWHERE h.date_created > GETDATE()-365


    --AND h.supplier_id = 9ORDER BY h.supplier_id , h.date_created

  • You need to partition the ROW_NUMBER function by your SUPPLIER_ID. The PARTITION BY statement is the group separator in a windowing function.

    Here's your code with ORD modified to reset on SUPPLIER_ID. I've also included an expression to get the difference in days between the current order date and the next order date using the LEAD windowing function.

    SELECT H.SUPPLIER_ID,
             CONVERT(VARCHAR(10), H.DATE_CREATED, 101) AS DATE_CREATED_TEXT,
             ROW_NUMBER() OVER(PARTITION BY H.SUPPLIER_ID ORDER BY H.DATE_CREATED) AS ORD, -- Sequential order of POs based on date: Ex. 1,2,3,4,5...
             DATEDIFF(DAY, H.date_created, LEAD(H.date_created) OVER(PARTITION BY H.SUPPLIER_ID ORDER BY H.DATE_CREATED)) AS DAYS_UNTIL_NEXT_ORDER
      FROM JEC_TEST_SUPPLIER AS H
      WHERE H.DATE_CREATED > GETDATE() - 365

    If you put that into a subquery and average the difference in days, I think you'd get the average days between orders that you're after.

    SELECT DATA.SUPPLIER_ID
      , AVG(CAST(DATA.DAYS_UNTIL_NEXT_ORDER AS DECIMAL(9,2))) AS AVERAGE_DAYS_BETWEEN_ORDERS
    FROM (
      SELECT H.SUPPLIER_ID,
             CONVERT(VARCHAR(10), H.DATE_CREATED, 101) AS DATE_CREATED_TEXT,
             ROW_NUMBER() OVER(PARTITION BY H.SUPPLIER_ID ORDER BY H.DATE_CREATED) AS ORD, -- Sequential order of POs based on date: Ex. 1,2,3,4,5...
             DATEDIFF(DAY, H.date_created, LEAD(H.date_created) OVER(PARTITION BY H.SUPPLIER_ID ORDER BY H.DATE_CREATED)) AS DAYS_UNTIL_NEXT_ORDER
      FROM JEC_TEST_SUPPLIER AS H
      WHERE H.DATE_CREATED > GETDATE() - 365
    ) AS DATA
    GROUP BY
      DATA.SUPPLIER_ID

  • That is wonderful. I knew I was missing something. The PARTITION makes sense once you see it on the screen. It works like a charm. The DATEDIFF really helps in the averaging also.Thanks so much.

  • One more question. There is the possibility that two or more POs to a single supplier can be cut on the same day. I'd like to exclude any additional POs that have the same date and only use one for the AVG calculation. I assume it's something like excluding this...
    DATEDIFF(DAY, H.date_created, LEAD(H.date_created) OVER(PARTITION BY H.SUPPLIER_ID ORDER BY H.DATE_CREATED))  > 0
    ...so that I don't see the range where there were 0 days between, but I'm not sure how to incorporate that into the second script above.

  • Figured it out. Added a row to the subquery to number the dates that were the same date.
    , ROW_NUMBER () OVER (PARTITION BY CONVERT(varchar(10),h.date_created,101) ORDER BY CONVERT(varchar(10),h.date_created,101)) AS podate
    Then added a WHERE clause to limit to only the first row.
    DATA.podate = 1
    Hope this helps if someone else needs it.

  • jcobb 20350 - Thursday, March 1, 2018 9:00 AM

    One more question. There is the possibility that two or more POs to a single supplier can be cut on the same day. I'd like to exclude any additional POs that have the same date and only use one for the AVG calculation. I assume it's something like excluding this...
    DATEDIFF(DAY, H.date_created, LEAD(H.date_created) OVER(PARTITION BY H.SUPPLIER_ID ORDER BY H.DATE_CREATED))  > 0
    ...so that I don't see the range where there were 0 days between, but I'm not sure how to incorporate that into the second script above.

    How about the following:SELECT
        D.SUPPLIER_ID,
        AVG(CAST(D.DAYS_UNTIL_NEXT_ORDER AS DECIMAL(9,2))) AS AVERAGE_DAYS_BETWEEN_ORDERS
    FROM (
    SELECT H.SUPPLIER_ID,
       CONVERT(VARCHAR(10), H.DATE_CREATED, 101) AS DATE_CREATED_TEXT,
       --ROW_NUMBER() OVER(PARTITION BY H.SUPPLIER_ID ORDER BY H.DATE_CREATED) AS ORD, -- Sequential order of POs based on date: Ex. 1,2,3,4,5...
       DATEDIFF(DAY, H.date_created, LEAD(H.date_created) OVER(PARTITION BY H.SUPPLIER_ID ORDER BY H.DATE_CREATED)) AS DAYS_UNTIL_NEXT_ORDER
    FROM (
            SELECT SUPPLIER_ID, DATE_CREATED
            FROM JEC_TEST_SUPPLIER
            WHERE DATE_CREATED > DATEADD(year, -1, GETDATE())
            GROUP BY SUPPLIER_ID, DATE_CREATED
            ) AS H
        ) AS D
    GROUP BY D.SUPPLIER_ID;

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • jcobb 20350 - Thursday, March 1, 2018 10:22 AM

    Figured it out. Added a row to the subquery to number the dates that were the same date.
    , ROW_NUMBER () OVER (PARTITION BY CONVERT(varchar(10),h.date_created,101) ORDER BY CONVERT(varchar(10),h.date_created,101)) AS podate
    Then added a WHERE clause to limit to only the first row.
    DATA.podate = 1
    Hope this helps if someone else needs it.

    Not sure that's accurate.  You didn't include the supplier id in the PARTITION BY.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • Thank you for double checking my work. I was focused on the date and forgot to partition by supplier. Now it looks right.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply