Find last record in last 12 months of current record

  • I need to build a solution and for brevity I'm going to use the example of customers buying products.

    I want to return, for the current full year (in this case 2018) customer records in the year (2018) that bought a product in the previous 12 months of that purchase.
    This means I'm looking at a rolling 12 month (potentially 24 months of data 2017-01-01- 2018-12-31).
    example:
    Customer A buys a product at 2018-08-10.  If customer has bought a product before this in the previous 12 months from 2017-08-10- 2018-08-10, Return the previous purchase record and so on. If customer A buys a product on 2018-01-01 (first day of current full year), then I'm looking as far back as 2017-01-01.

    CustomerpurchasedateLastpurchasedate DaysDifferencerepeat countRecStatus
    c120/12/201804/12/2018168
    c104/12/201825/07/20181327
    c125/07/201801/03/20181466
    c101/03/201805/02/2018245
    c105/02/201825/01/2018114
    c125/01/201805/01/2018203
    c105/01/201801/05/20172492
    c101/05/201701/01/20171201
    c101/01/2017null0delete
    c220/12/201804/12/2018166
    c204/12/201825/07/20181325
    c201/08/201805/02/20181774
    c225/07/201801/03/20181463
    c205/07/201825/01/20181612
    c205/05/201801/05/20173691
    c201/02/201710/01/2017220delete
    c210/01/2017null0delete

    You can see that for Customer C1, the first date of purchase in the current year is 2018-05-01.  there was a record made in the previous 12 months from this date which is 2018-05-01.  This means 2018-05-01 is included and is the first purchase.  As long as the subsequent purchases are in the current year and within 12 months of each other the repeat count increments.

    Customer C2s first record in the current month is 2018-05-05.  However the last purchase record is 2017-02-01 which is more than 12 months so 2018-05-05 is the first record since it's in the current year.

    I hope this all makes sense.  Ideally I'd like to have a similar result set to the table above.  I've tried the following as an example but this is not working as i can't delete the records the records mentioned above.
    Any help would be very appreciated.


    FROM (
             SELECT Row_number() OVER( partition BY customer ORDER BY purchasedate) AS rn, * FROM data
            WHERE purchasedate between '2017-01-01' and '2019-01-01'
            ) a
       LEFT JOIN
         (
             SELECT Row_number() OVER( partition BY customer ORDER BY purchasedate) AS rn, * FROM data
            WHERE purchasedate between '2018-01-01' and '2019-01-01'
            ) b
         ON b.rn + 1 = a.rn
          AND a.customer = b.customer

  • I'm not sure I understand. Are you looking for only 1 or 2 records per customer? Meaning a purchase within the last year (or calendar year) and possibly a purchase in the 12 months before that. Are you returning all purchases in 2018?

    I would tackle this initially in a series of CTEs to organize the logic and before you're getting the results you want. I'd get a specific set of test data and results you want, especially at the boundary cases.

    I think that you would get a list of customers that have had a purchase in the last year (or 2018). It's not quite clear if you mean calendar for the last purchase. From there, I'd use that list of customers and orders and search for an order that is within a dateadd(year, -1) of that date. That lets you get two lists of things. You could order those, but depending on how deeply you filter (all orders, 1, 2, etc), it may be more complex.

    Posting  some DDL and test data would be good. Show specific results with your explanation.

  • First, this is not the best way to post sample data.  I've created a quick and dirty method of doing it, because I'm a bit short on time.

    SELECT *
    INTO #data
    FROM (
    VALUES
        ('c1', CAST('20181220' AS DATE), CAST('20181204' AS DATE), 16, 8, NULL),
        ('c1', '20181204', '20180725', 132, 7, NULL),
        ('c1', '20180725', '20180301', 146, 6, NULL),
        ('c1', '20180301', '20180205', 24, 5, NULL),
        ('c1', '20180205', '20180125', 11, 4, NULL),
        ('c1', '20180125', '20180105', 20, 3, NULL),
        ('c1', '20180105', '20170501', 249, 2, NULL),
        ('c1', '20170501', '20170101', 120, 1, NULL),
        ('c1', '20170101', null, NULL, 0, 'delete'),
        ('c2', '20181220', '20181204', 16, 6, NULL),
        ('c2', '20181204', '20180725', 132, 5, NULL),
        ('c2', '20180801', '20180205', 177, 4, NULL),
        ('c2', '20180725', '20180301', 146, 3, NULL),
        ('c2', '20180705', '20180125', 161, 2, NULL),
        ('c2', '20180505', '20170501', 369, 1, NULL),
        ('c2', '20170201', '20170110', 22, 0, 'delete'),
        ('c2', '20170110', null, NULL, 0, 'delete')
    )v (Customer, purchasedate, Lastpurchasedate, DaysDifference, repeat_count, RecStatus)

    I also wasn't sure exactly what you were looking for with your ranges.  I interpreted it as customers who hadn't visited in 12 months as having "lapsed" and being considered "new" customers.

    WITH new_returning AS
    (
        SELECT *,
            CASE WHEN LEAD(purchasedate, 1, '9999-12-30') OVER(PARTITION BY Customer ORDER BY purchasedate) > DATEADD(YEAR, 1, purchasedate) THEN 1 ELSE 0 END AS new_returning_customer
        FROM #data
    )
    , customer_groups AS
    (
        SELECT *, SUM(new_returning_customer) OVER(PARTITION BY Customer ORDER BY purchasedate DESC) AS grp
        FROM new_returning
    )
    SELECT *
    FROM customer_groups
    WHERE grp = 1

    I'm also not quite getting the results you are expecting, but it is close.  Looking at the data, it matches what I expect.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply