rolling monthly analysis

Question

Post reply

rolling monthly analysis

Chrissy321

SSCoach

Points: 15750
More actions
August 2, 2012 at 11:09 am

#264569

Hi All,
I'd like to perform a rolling monthly analysis.
Let's say I have a sales table and if a client has sales during a particular month a record gets entered into the table with the month end date indicating sales activity during that month.
I want to determine when there is a new client and when a client is lost.
Logically I would use NOT IN logic to compare the second month against the first and then the first month against the second to determine the second months results (February in my sample data)
Then I would compare the third month(March) against the second( February). And so on...
If a client drops out and then comes back we can still consider this a 'NewClient'
The number of months is indeterminate.
Thanks if you can help.
CREATE TABLE #Sales
(
ClientID int,
SaleDate date
)
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120131')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120131')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (3, '20120131')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120131')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120229')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120229')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120229')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120229')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120331')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120331')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120331')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120331')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120430')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120430')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120430')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120531')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120531')
INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120531')
SELECT * FROM #Sales
DROP TABLE #Sales
--Expected Results
ClientAction,MonthYear,ClientID
----------------------
Newclient,February 2012,5
Lostclient,February 2012,3
Lostclient,March 2012,4
Newclient,March 2012,6
Lostclient,April 2012,2
Lostclient,April 2012,5
Newclient,April 2012,7
Lostclient,May 2012,6
Newclient,May 2012,5

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

diamondgm SSCrazy Points: 2824 More actions · Answer 1

Hi,

I've solved a problem similar to this before, and I can't for the life of me figure out how I did it.

Maybe the code below will put you (or someone) on the right track...

Good luck!

;WITH Moo AS

(

SELECT ClientID_L= T1.ClientID

,SaleDate_L= T1.SaleDate

,ClientID_R= T2.ClientID

,SaleDate_R= T2.SaleDate

FROM#Sales T1

LEFT JOIN#Sales T2 ON T1.ClientID = T2.ClientID AND DATEDIFF(MONTH, T1.SaleDate, T2.SaleDate) = 1

),

Moo2 AS

(

SELECT ClientID_L

,SaleDate_L

,sStatus =CASE

WHEN ClientID_R IS NOT NULL THEN 'Start' ELSE 'End'

END

FROM Moo

)

SELECT *

FROM Moo2

ORDER BY ClientID_L, SaleDate_L

CapnHector SSCoach Points: 16143 More actions · Answer 2

Here using a full join where both sides are null we can get all our new or old clients and depending on where the nulls are can assign whether they were a new client or lost client.

DECLARE @StartMonth DATETIME = (SELECT DATEADD(MM,DATEDIFF(MM,0,MIN(SaleDate)) + 1,0) FROM #Sales)

DECLARE @EndMonth DATETIME = (SELECT DATEADD(MM,DATEDIFF(MM,0,MAX(SaleDate)),0) FROM #Sales)

SELECT CASE WHEN s2.ClientID IS NULL THEN 'NewClient' ELSE 'LostClient' END AS ClientAction,

RIGHT(CONVERT(VARCHAR, ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)), 106), 8) AS SaleDate,

ISNULL(s1.ClientID, s2.ClientID) AS ClientID

FROM #Sales s1

FULL JOIN #Sales s2

ON s1.ClientID = s2.ClientID

AND MONTH(s1.SaleDate) = MONTH(s2.SaleDate) + 1

WHERE (s1.ClientID IS NULL

OR s2.ClientID IS NULL)

AND (s2.SaleDate < @EndMonth

OR s1.SaleDate > @StartMonth)

ORDER BY ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)), ISNULL(s1.ClientID, s2.ClientID)

EDIT: Fixed code to eliminate the fact that all clients in january were new (no DEC 2011 DATA) and all clients in Jun were lost (no data for JUN 2012).

this should work when you add in more months as i take the max and min from the table.

For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 3

Something like this maybe?

CREATE TABLE #Sales

(

ClientID int,

SaleDate date

);

INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120131');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120131');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (3, '20120131');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120131');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120229');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120229');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (4, '20120229');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120229');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120331');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (2, '20120331');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120331');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120331');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120430');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (6, '20120430');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120430');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (1, '20120531');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (5, '20120531');

INSERT INTO #Sales (ClientID, SaleDate)VALUES (7, '20120531');

--SELECT * FROM #Sales;

declare @StartDate date = '20120101',

@EndDate date = '20120630';

WITH SalesActivity as (

SELECT

CASE WHEN s2.ClientID IS NULL

THEN 'NewClient'

ELSE 'LostClient'

END AS ClientAction,

RIGHT(CONVERT(VARCHAR, ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)), 106), 8) AS SaleDate,

ISNULL(s1.ClientID, s2.ClientID) AS ClientID,

ISNULL(s1.SaleDate, DATEADD(M,1,s2.SaleDate)) SalesDate

FROM

#Sales s1

FULL JOIN #Sales s2

ON s1.ClientID = s2.ClientID

AND MONTH(s1.SaleDate) = MONTH(s2.SaleDate) + 1

WHERE

s1.ClientID IS NULL

OR s2.ClientID IS NULL

)

SELECT

ClientAction,

SaleDate,

ClientID

FROM

SalesActivity

WHERE

SalesDate between dateadd(mm,1,@StartDate) and dateadd(mm,datediff(mm,0,@EndDate),-1)

ORDER BY

SalesDate,

ClientID;

GO

DROP TABLE #Sales

go

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 4

Lynn Pettis

SSC Guru

Points: 442467

August 2, 2012 at 2:21 pm

#1521133

That's what I get for not checking first.

CapnHector SSCoach Points: 16143 More actions · Answer 5

Lynn Pettis (8/2/2012)
That's what I get for not checking first.

LOL :w00t::w00t:

i posted what i had figuring on working on it and actually expecting some one to beat me to it as i was getting a little busy with other stuff.

For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 6

capn.hector (8/2/2012)
Lynn Pettis (8/2/2012)
That's what I get for not checking first.
LOL :w00t::w00t:
i posted what i had figuring on working on it and actually expecting some one to beat me to it as i was getting a little busy with other stuff.

I got interupted as well.

Chrissy321 SSCoach Points: 15750 More actions · Answer 7

I'm fascinated.....I am working with Lynn's code.

Say I wanted certain clients those whose ClientID was < than 3.

Clearly I can add any conditions to the final SELECT statement but performance would suffer if I had a multitude of clients.

Can I filter on the CTE?

I tried the following but did not get the expected results.

WHERE

(s1.ClientID IS NULL OR s1.ClientID < 3)

OR

(s2.ClientID IS NULL OR s2.ClientID < 3)

I appreciate the input. Obviously these are not real tables and data, just an abstraction of my real problem. Sometimes I over simplify things and don't anticipate all my needs. Bust since I am not exposing real data or data structure I get to keep my job! 😀

I am also wondering if I should solve each problem (new and lost) separately and union the results...

CapnHector SSCoach Points: 16143 More actions · Answer 8

Chrissy321 (8/2/2012)
I'm fascinated.....I am working with Lynn's code.
Say I wanted certain clients those whose ClientID was < than 3.
Clearly I can add any conditions to the final SELECT statement but performance would suffer if I had a multitude of clients.
Can I filter on the CTE?
I tried the following but did not get the expected results.
WHERE
(s1.ClientID IS NULL OR s1.ClientID < 3)
OR
(s2.ClientID IS NULL OR s2.ClientID < 3)
I appreciate the input. Obviously these are not real tables and data, just an abstraction of my real problem. Sometimes I over simplify things and don't anticipate all my needs. Bust since I am not exposing real data or data structure I get to keep my job! 😀
I am also wondering if I should solve each problem (new and lost) separately and union the results...

Lynn took my code and wrapped it so that Jan 2012 would not have 3 "New" clients and Jun 2012 would not have every client "Lost" i was able to fix my code before lynn posted his reply.

i am currently working on a test bed as i do believe my code will be more performant as i only have the single select but as i may get busy at work it may wait till monday.

For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden[/url] for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw[/url]

Need to Split some strings? Jeff Moden's DelimitedSplit8K[/url]
Jeff Moden's Cross tab and Pivots Part 1[/url]
Jeff Moden's Cross tab and Pivots Part 2[/url]

Chrissy321 SSCoach Points: 15750 More actions · Answer 9

Thanks capn. I am going to look at Mr Celko's approach with the Months table since it may greatly simplify matters.

Chrissy321 SSCoach Points: 15750 More actions · Answer 10

I ended up using a Months table, see below. This seems to simplify the analysis greatly.

Question: Is it appropriate to index every column in a calendar table like this that won't ever be updated and will only ever have on index??

I have 500 or so rows in this table.

CREATE TABLE [reference].[Months](

[Month_Name] [varchar](9) NOT NULL,

[Month_Year] [int] NOT NULL,

[Month_FirstDay] [date] NOT NULL,

[Month_LastDay] [date] NOT NULL,

[PreviousMonth_FirstDay] [date] NOT NULL,

[PreviousMonth_LastDay] [date] NOT NULL,

[NextMonth_FirstDay] [date] NOT NULL,

[NextMonth_LastDay] [date] NOT NULL,

[Quarter_Name] [varchar](7) NOT NULL,

CONSTRAINT [PK_Months_1] PRIMARY KEY CLUSTERED

(

[Month_Name] ASC,

[Month_Year] ASC,

[Month_FirstDay] ASC,

[Month_LastDay] ASC,

[PreviousMonth_FirstDay] ASC,

[PreviousMonth_LastDay] ASC,

[NextMonth_FirstDay] ASC,

[NextMonth_LastDay] ASC,

[Quarter_Name] ASC

)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

) ON [PRIMARY]