rolling monthly analysis

  • Hi All,

    I'd like to perform a rolling monthly analysis.

    Let's say I have a sales table and if a client has sales during a particular month a record gets entered into the table with the month end date indicating sales activity during that month.

    I want to determine when there is a new client and when a client is lost.

    Logically I would use NOT IN logic to compare the second month against the first and then the first month against the second to determine the second months results (February in my sample data)

    Then I would compare the third month(March) against the second( February). And so on...

    If a client drops out and then comes back we can still consider this a 'NewClient'

    The number of months is indeterminate.

    Thanks if you can help.

    CREATE TABLE #Sales

    (

    ClientID int,

    SaleDate date

    )

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120131')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120131')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (3, '20120131')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120131')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120229')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120229')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120229')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120229')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120331')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120331')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120331')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120331')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120430')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120430')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120430')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120531')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120531')

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120531')

    SELECT * FROM #Sales

    DROP TABLE #Sales

    --Expected Results

    ClientAction,MonthYear,ClientID

    ----------------------

    Newclient,February 2012,5

    Lostclient,February 2012,3

    Lostclient,March 2012,4

    Newclient,March 2012,6

    Lostclient,April 2012,2

    Lostclient,April 2012,5

    Newclient,April 2012,7

    Lostclient,May 2012,6

    Newclient,May 2012,5

  • Hi,

    I've solved a problem similar to this before, and I can't for the life of me figure out how I did it.

    Maybe the code below will put you (or someone) on the right track...

    Good luck!

    ;WITH Moo AS

    (

    SELECT ClientID_L= T1.ClientID

    ,SaleDate_L= T1.SaleDate

    ,ClientID_R= T2.ClientID

    ,SaleDate_R= T2.SaleDate

    FROM#Sales T1

    LEFT JOIN#Sales T2 ON T1.ClientID = T2.ClientID AND DATEDIFF(MONTH, T1.SaleDate, T2.SaleDate) = 1

    ),

    Moo2 AS

    (

    SELECT ClientID_L

    ,SaleDate_L

    ,sStatus =CASE

    WHEN ClientID_R IS NOT NULL THEN 'Start' ELSE 'End'

    END

    FROM Moo

    )

    SELECT *

    FROM Moo2

    ORDER BY ClientID_L, SaleDate_L

  • Here using a full join where both sides are null we can get all our new or old clients and depending on where the nulls are can assign whether they were a new client or lost client.

    DECLARE @StartMonth DATETIME = (SELECT DATEADD(MM,DATEDIFF(MM,0,MIN(SaleDate)) + 1,0) FROM #Sales)

    DECLARE @EndMonth DATETIME = (SELECT DATEADD(MM,DATEDIFF(MM,0,MAX(SaleDate)),0) FROM #Sales)

    SELECT CASE WHEN s2.ClientID IS NULL THEN 'NewClient' ELSE 'LostClient' END AS ClientAction,

    RIGHT(CONVERT(VARCHAR, ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)), 106), 8) AS SaleDate,

    ISNULL(s1.ClientID, s2.ClientID) AS ClientID

    FROM #Sales s1

    FULL JOIN #Sales s2

    ON s1.ClientID = s2.ClientID

    AND MONTH(s1.SaleDate) = MONTH(s2.SaleDate) + 1

    WHERE (s1.ClientID IS NULL

    OR s2.ClientID IS NULL)

    AND (s2.SaleDate < @EndMonth

    OR s1.SaleDate > @StartMonth)

    ORDER BY ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)), ISNULL(s1.ClientID, s2.ClientID)

    EDIT: Fixed code to eliminate the fact that all clients in january were new (no DEC 2011 DATA) and all clients in Jun were lost (no data for JUN 2012).

    this should work when you add in more months as i take the max and min from the table.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • Something like this maybe?

    CREATE TABLE #Sales

    (

    ClientID int,

    SaleDate date

    );

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120131');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120131');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (3, '20120131');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120131');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120229');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120229');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120229');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120229');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120331');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120331');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120331');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120331');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120430');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120430');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120430');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120531');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120531');

    INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120531');

    --SELECT * FROM #Sales;

    declare @StartDate date = '20120101',

    @EndDate date = '20120630';

    WITH SalesActivity as (

    SELECT

    CASE WHEN s2.ClientID IS NULL

    THEN 'NewClient'

    ELSE 'LostClient'

    END AS ClientAction,

    RIGHT(CONVERT(VARCHAR, ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)), 106), 8) AS SaleDate,

    ISNULL(s1.ClientID, s2.ClientID) AS ClientID,

    ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)) SalesDate

    FROM

    #Sales s1

    FULL JOIN #Sales s2

    ON s1.ClientID = s2.ClientID

    AND MONTH(s1.SaleDate) = MONTH(s2.SaleDate) + 1

    WHERE

    s1.ClientID IS NULL

    OR s2.ClientID IS NULL

    )

    SELECT

    ClientAction,

    SaleDate,

    ClientID

    FROM

    SalesActivity

    WHERE

    SalesDate between dateadd(mm,1,@StartDate) and dateadd(mm,datediff(mm,0,@EndDate),-1)

    ORDER BY

    SalesDate,

    ClientID;

    GO

    DROP TABLE #Sales

    go

  • That's what I get for not checking first.

  • Lynn Pettis (8/2/2012)


    That's what I get for not checking first.

    LOL :w00t::w00t:

    i posted what i had figuring on working on it and actually expecting some one to beat me to it as i was getting a little busy with other stuff.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • capn.hector (8/2/2012)


    Lynn Pettis (8/2/2012)


    That's what I get for not checking first.

    LOL :w00t::w00t:

    i posted what i had figuring on working on it and actually expecting some one to beat me to it as i was getting a little busy with other stuff.

    I got interupted as well.

  • I'm fascinated.....I am working with Lynn's code.

    Say I wanted certain clients those whose ClientID was < than 3.

    Clearly I can add any conditions to the final SELECT statement but performance would suffer if I had a multitude of clients.

    Can I filter on the CTE?

    I tried the following but did not get the expected results.

    WHERE

    (s1.ClientID IS NULL OR s1.ClientID < 3)

    OR

    (s2.ClientID IS NULL OR s2.ClientID < 3)

    I appreciate the input. Obviously these are not real tables and data, just an abstraction of my real problem. Sometimes I over simplify things and don't anticipate all my needs. Bust since I am not exposing real data or data structure I get to keep my job! 😀

    I am also wondering if I should solve each problem (new and lost) separately and union the results...

  • Chrissy321 (8/2/2012)


    I'm fascinated.....I am working with Lynn's code.

    Say I wanted certain clients those whose ClientID was < than 3.

    Clearly I can add any conditions to the final SELECT statement but performance would suffer if I had a multitude of clients.

    Can I filter on the CTE?

    I tried the following but did not get the expected results.

    WHERE

    (s1.ClientID IS NULL OR s1.ClientID < 3)

    OR

    (s2.ClientID IS NULL OR s2.ClientID < 3)

    I appreciate the input. Obviously these are not real tables and data, just an abstraction of my real problem. Sometimes I over simplify things and don't anticipate all my needs. Bust since I am not exposing real data or data structure I get to keep my job! 😀

    I am also wondering if I should solve each problem (new and lost) separately and union the results...

    Lynn took my code and wrapped it so that Jan 2012 would not have 3 "New" clients and Jun 2012 would not have every client "Lost" i was able to fix my code before lynn posted his reply.

    i am currently working on a test bed as i do believe my code will be more performant as i only have the single select but as i may get busy at work it may wait till monday.


    For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

    For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

    Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
    Jeff Moden's Cross tab and Pivots Part 1[/url]
    Jeff Moden's Cross tab and Pivots Part 2[/url]

  • Thanks capn. I am going to look at Mr Celko's approach with the Months table since it may greatly simplify matters.

  • I ended up using a Months table, see below. This seems to simplify the analysis greatly.

    Question: Is it appropriate to index every column in a calendar table like this that won't ever be updated and will only ever have on index??

    I have 500 or so rows in this table.

    CREATE TABLE [reference].[Months](

    [Month_Name] [varchar](9) NOT NULL,

    [Month_Year] [int] NOT NULL,

    [Month_FirstDay] [date] NOT NULL,

    [Month_LastDay] [date] NOT NULL,

    [PreviousMonth_FirstDay] [date] NOT NULL,

    [PreviousMonth_LastDay] [date] NOT NULL,

    [NextMonth_FirstDay] [date] NOT NULL,

    [NextMonth_LastDay] [date] NOT NULL,

    [Quarter_Name] [varchar](7) NOT NULL,

    CONSTRAINT [PK_Months_1] PRIMARY KEY CLUSTERED

    (

    [Month_Name] ASC,

    [Month_Year] ASC,

    [Month_FirstDay] ASC,

    [Month_LastDay] ASC,

    [PreviousMonth_FirstDay] ASC,

    [PreviousMonth_LastDay] ASC,

    [NextMonth_FirstDay] ASC,

    [NextMonth_LastDay] ASC,

    [Quarter_Name] ASC

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply