TSQL dense rank type question - finding the first date of a budget change (budget values can bounce back and forth in values)

  • I am trying to write a query to give me all "budget changes" (and the day they happened) per customer.

    Example is below:

    create table testTable

    (id tinyint NOT NULL, cust_id int NOT NULL, sDate datetime NOT NULL, budget DECIMAL(8,2) NOT NULL);

    INSERT INTO testTable (id,cust_id,sDate,budget) VALUES (1,1,'2015-01-01',100);

    INSERT INTO testTable (id,cust_id,sDate,budget) VALUES(2,1,'2015-02-01',100);

    INSERT INTO testTable (id,cust_id,sDate,budget) VALUES(9,1,'2015-03-01',500);

    INSERT INTO testTable (id,cust_id,sDate,budget) VALUES(4,1,'2015-04-01',100);

    INSERT INTO testTable (id,cust_id,sDate,budget) VALUES (3,2,'2015-01-01',50);

    INSERT INTO testTable (id,cust_id,sDate,budget) VALUES(7,2,'2015-02-01',10);

    INSERT INTO testTable (id,cust_id,sDate,budget) VALUES(5,2,'2015-03-01',50);

    INSERT INTO testTable (id,cust_id,sDate,budget) VALUES(8,2,'2015-04-01',50);

    Which looks like this:

    select * from testTable order by cust_id,sDate;

    idcust_idsDatebudget

    112015-01-01100.00

    212015-02-01100.00

    912015-03-01500.00

    412015-04-01100.00

    322015-01-0150.00

    722015-02-0110.00

    522015-03-0150.00

    822015-04-0150.00

    Desired output: (Give me all records that reflect budget changes from the previous period) which is rows: 1,9,4,3,7,5

    cust_idsDateBudget

    12015-01-01100

    12015-03-01500

    12015-04-01100

    22015-01-0150

    22015-02-0110

    22015-03-0150

    My best guess is:

    SELECT cust_Id,sDate,budget FROM

    (select cust_id,sDate,budget, DENSE_RANK() OVER (PARTITION BY cust_id,budget ORDER BY cust_id,sDate asc,budget) dr

    from testTable ) inlineT WHERE dr=1 order by 1,2;

    Which actually returns me incorrect results because the dense rank does not restart due to the budget change :

    cust_idsDate budget

    1 2015-01-01 100.00

    1 2015-03-01 500.00

    2 2015-02-01 10.00

    2 2015-01-01 50.00

  • Try this

    WITH CTE AS (

    SELECT cust_Id,sDate,budget,

    ROW_NUMBER() OVER(PARTITION BY cust_id ORDER BY sDate) -

    ROW_NUMBER() OVER(PARTITION BY cust_id,budget ORDER BY sDate) AS rnDiff

    FROM testTable)

    SELECT cust_id,

    MIN(sDate) AS sDate,

    budget

    FROM CTE

    GROUP BY cust_id,budget,rnDiff

    ORDER BY cust_id,MIN(sDate);

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • This will get you what you're after:

    SELECT

    cust_id

    , budget

    , LastYearBudget

    FROM

    (

    SELECT

    cust_id

    , budget

    , LAG(budget, 1, -1) OVER (PARTITION BY cust_id ORDER BY sDate) AS LastYearBudget

    FROM

    testTable

    ) tmp

    WHERE

    budget <> LastYearBudget

    AND

    LastYearBudget <> -1

    ;

  • Mark Cowne (4/17/2015)


    Try this

    WITH CTE AS (

    SELECT cust_Id,sDate,budget,

    ROW_NUMBER() OVER(PARTITION BY cust_id ORDER BY sDate) -

    ROW_NUMBER() OVER(PARTITION BY cust_id,budget ORDER BY sDate) AS rnDiff

    FROM testTable)

    SELECT cust_id,

    MIN(sDate) AS sDate,

    budget

    FROM CTE

    GROUP BY cust_id,budget,rnDiff

    ORDER BY cust_id,MIN(sDate);

    Quick word of caution, this query requires the set to be sorted three times, even when an optimal POC index is present.

    😎

  • Quick solution, similar to the one Ozzmodiar posted, with an additional POC index which eliminates any sorting operations in the execution plan.

    😎

    USE tempdb;

    GO

    SET NOCOUNT ON;

    IF OBJECT_ID(N'dbo.testTable') IS NOT NULL DROP TABLE dbo.testTable;

    create table dbo.testTable

    (

    id TINYINT NOT NULL

    ,cust_id INT NOT NULL

    ,sDate DATETIME NOT NULL

    ,budget DECIMAL(8,2) NOT NULL

    );

    INSERT INTO dbo.testTable (id,cust_id,sDate,budget)

    VALUES

    (1,1,'2015-01-01',100)

    ,(2,1,'2015-02-01',100)

    ,(9,1,'2015-03-01',500)

    ,(4,1,'2015-04-01',100)

    ,(3,2,'2015-01-01',50)

    ,(7,2,'2015-02-01',10)

    ,(5,2,'2015-03-01',50)

    ,(8,2,'2015-04-01',50)

    ;

    /* POC (Partition Order Covering) index for dbo.testTable */

    CREATE UNIQUE NONCLUSTERED INDEX UNQNCLIDX_DBO_TEST_TABLE_CUST_ID_SDATE_INCL_BUDGET ON dbo.testTable

    (

    cust_id ASC

    ,sDate ASC

    )

    INCLUDE

    (

    budget

    -- ,id /* Add this column if needed in the output */

    );

    ;WITH BASE_DATA AS

    (

    SELECT

    TT.cust_id

    ,TT.sDate

    ,TT.budget

    ,CASE

    WHEN TT.budget = LAG(TT.budget,1,-1) OVER

    (

    PARTITION BY TT.cust_id

    ORDER BY TT.sDate

    ) THEN 0

    ELSE 1

    END AS CHANGE_FLAG

    FROM dbo.testTable TT

    )

    SELECT

    BD.cust_id

    ,BD.sDate

    ,BD.budget

    FROM BASE_DATA BD

    WHERE BD.CHANGE_FLAG = 1;

    Results

    cust_id sDate budget

    ----------- ----------------------- --------

    1 2015-01-01 00:00:00.000 100.00

    1 2015-03-01 00:00:00.000 500.00

    1 2015-04-01 00:00:00.000 100.00

    2 2015-01-01 00:00:00.000 50.00

    2 2015-02-01 00:00:00.000 10.00

    2 2015-03-01 00:00:00.000 50.00

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply