Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Customer Ranking


Customer Ranking

Author
Message
upperbognor
upperbognor
SSC Journeyman
SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)

Group: General Forum Members
Points: 95 Visits: 936
I'm required to rank our customers based on total purchases into 3 groups (1-high, 2-medium, 3-low) valued customers, using the total purchasing power of all customers by year. By dividing the Sum of total sales by year into 3 groups and then allocating customers to these groups. Generally speaking, the low value group has more customers then the medium, the medium more then the high because it takes less of them to get to 1/3 the total sales. My current solution takes 1 minute 9 sec to run on approx 9600 customers and I'm wondering if there is a faster solution possibly using CTE and the built in Rank functions. It's currently taking the longest calculating the running total due to the cross join.

This will allow us to see changes in customer value, e.g. a low value customer has become a high value one, or a high value has become a medium value customer.

I've included my current solution in with the test data.

--===== If the test table already exists, drop it
IF OBJECT_ID('TempDB..#Sales','U') IS NOT NULL
DROP TABLE #Sales
IF OBJECT_ID('TempDB..#RunningTotal2007','U') IS NOT NULL
DROP TABLE #RunningTotal2007
IF OBJECT_ID('TempDB..#RunningTotal2008','U') IS NOT NULL
DROP TABLE #RunningTotal2008

--Create the temp table
Create Table #Sales
(
House_ID Int Primary Key Clustered
,[2007] Decimal(18,2)
,[2008] Decimal(18,2)
,[2007 Rank] nvarchar(50)
,[2008 Rank] nvarchar(50)
)

--Insert our test data (Actual code runs on 15,000 rows +/-)
Insert Into #Sales
Select
House_ID
,[2007]
,[2008]
,Cast(Null As nvarchar(50)) As [2007 Rank]
,Cast(Null As nvarchar(50)) As [2008 Rank]
From
(
Select
1 As House_ID
,125 As [2007]
,569 As [2008]
Union All
Select 2,212,924
Union All
Select 3,354,862
Union All
Select 4,824,724
Union All
Select 5,565,628
Union All
Select 6,618,1028
Union All
Select 7,784,488
Union All
Select 8,465,321
Union All
Select 9,994,284
Union All
Select 10,1016,161
)A


--declare our dividor
Declare @Dividor As Decimal(18,4)

--get a running total starting at the smallest number and working to the largest
Select
a.House_ID As House_ID
,Sum(b.[2007]) As Total
Into
#RunningTotal2007
From
#Sales a
Cross Join #Sales b
Where
IsNull(b.[2007],0) <= IsNull(a.[2007],0)
Group By
a.House_ID

--We want 3 groups of customers, high, medium or low ones, this could also be an integer value
--1,2,3 and converted on our output to high, medium low when we join up with our demographics.
Set @Dividor = (Select Sum(IsNull([2007],0)) / 3 from #Sales)
update #Sales
Set [2007 Rank] = Case
When isNull(Temp.Total,0) <= @Dividor Then 'Low'
When (isNull(Temp.Total,0) > @Dividor) And (isNull(Temp.Total,0) <= @Dividor*2) then 'Medium'
When (isNull(Temp.Total,0) > @Dividor * 2) Then 'High'
End
From
(
Select
House_ID
,Total
From
#RunningTotal2007
)Temp
Where
#Sales.house_ID = temp.house_ID

--Calculate a running total for the next year
Select
a.House_ID As House_ID
,Sum(b.[2008]) As Total
Into
#RunningTotal2008
From
#Sales a
Cross Join #Sales b
Where
IsNull(b.[2008],0) <= IsNull(a.[2008],0)
Group By
a.House_ID

--Again apply the same divisor rules
Set @Dividor = (Select Sum(IsNull([2008],0)) / 3 from #Sales)
update #Sales
Set [2008 Rank] = Case
When isNull(Temp.Total,0) <= @Dividor Then 'Low'
When (isNull(Temp.Total,0) > @Dividor) And (isNull(Temp.Total,0) <= @Dividor*2) then 'Medium'
When (isNull(Temp.Total,0) > @Dividor * 2) Then 'High'
End
From
(
Select
House_ID
,Total
From
#RunningTotal2008
)Temp
Where
#Sales.house_ID = temp.house_ID


--Lets have the results
Select
*
From
#Sales
Order By
[2008] Desc




Fraggle-805517
Fraggle-805517
Right there with Babe
Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)

Group: General Forum Members
Points: 733 Visits: 1510
Have you tried using the NTILE function. Really designed to do just this type of stuff. Run this after inserting into the data into the #Sales table.




select House_ID, [2007], Ntile(3) over (order by [2007] desc) as '2007 Ranking', [2008], NTILE(3) over (order by [2008] desc) as '2008 Ranking'
from #sales




The ranking numbers here don't match perfectly with yours, but they are close and probably more accurage. Also, the execution plan I saw shows that it should run faster and with a lot less coding. Let me know how it works.

Fraggle
upperbognor
upperbognor
SSC Journeyman
SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)

Group: General Forum Members
Points: 95 Visits: 936
I thought about that, the problem is it doesn't divide the Sum([2008]) for example into 3 approximately even groups. If you look at the output from the query you provided 59% of the sales dollars have a rank of 1, 28% have a rank of 2, and 12.8% have a rank of 3. I'm look more for a result of 33.3% for each group +/- a few %.

upperbognor
Fraggle-805517
Fraggle-805517
Right there with Babe
Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)

Group: General Forum Members
Points: 733 Visits: 1510
Those are not the results I get. I get 40% in ranked 1, 30% ranked 2, and 30% ranked 3 for the 10 Rows. So this is roughtly 33% each and probably would be if there was 12 rows.

Here is my output.

House_ID 2007 2007 Ranking 2008 2008 Ranking
----------- --------------------------------------- -------------------- --------------------------------------- --------------------
6 618.00 2 1028.00 1
2 212.00 3 924.00 1
3 354.00 3 862.00 1
4 824.00 1 724.00 1
5 565.00 2 628.00 2
1 125.00 3 569.00 2
7 784.00 1 488.00 2
8 465.00 2 321.00 3
9 994.00 1 284.00 3
10 1016.00 1 161.00 3

(10 row(s) affected)



Fraggle
upperbognor
upperbognor
SSC Journeyman
SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)

Group: General Forum Members
Points: 95 Visits: 936
This would be my ideal ranking. Using the 2008 year as an example

Select 6,1028,'High' union all
Select 2,924,'High' union all
Select 3,862,'Medium' union all
Select 4,724,'Medium' union all
Select 5,628,'Medium' union all
Select 1,569,'Low' union all
Select 7,488,'Low' union all
Select 8,321,'Low' union all
Select 9,284,'Low' union all
Select 10,161,'Low'


In this case it breaks down to 32.5% high, 36.9% medium, 30.4% low.
upperbognor
upperbognor
SSC Journeyman
SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)

Group: General Forum Members
Points: 95 Visits: 936
Ah, I think see the confusion. I need the distibution by sales not by the number of customers in each group. See my post above for it. It's not the number of customers but the sum of sales for each group that needs to distribute into 1/3rds.

thank you very much for the assistance you've proviced so far Fraggle!

Upperbognor
Fraggle-805517
Fraggle-805517
Right there with Babe
Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)Right there with Babe (733 reputation)

Group: General Forum Members
Points: 733 Visits: 1510


This would be my ideal ranking. Using the 2008 year as an example

Select 6,1028,'High' union all
Select 2,924,'High' union all
Select 3,862,'Medium' union all
Select 4,724,'Medium' union all
Select 5,628,'Medium' union all
Select 1,569,'Low' union all
Select 7,488,'Low' union all
Select 8,321,'Low' union all
Select 9,284,'Low' union all
Select 10,161,'Low'

In this case it breaks down to 32.5% high, 36.9% medium, 30.4% low.



You will have to forgive me, but if you have 2 rankings of 'High' in a set of 10, how does that equal 32.5%? My maths says that is equal to 20% (2 rows / 10 total rows).

The NTILE example I gave you is grouping by sales.
upperbognor
upperbognor
SSC Journeyman
SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)SSC Journeyman (95 reputation)

Group: General Forum Members
Points: 95 Visits: 936
Maybe this can show the percentages better.

High 1952
% Of Total High 32.593
Medium 2214
% Of Total Medium 36.9677
Low 1823
% Of Total Low 30.4391
Total 5989




Select
Sum([High]) As High
,Sum(High)/ (Sum(High) + Sum(Medium) + Sum(Low)) * 100 As [% Of Total High]
,Sum([Medium]) As Medium
,Sum(Medium)/ (Sum(High) + Sum(Medium) + Sum(Low)) * 100 As [% Of Total High]
,Sum([Low]) As Low
,Sum(Low)/ (Sum(High) + Sum(Medium) + Sum(Low)) * 100 As [% Of Total High]
,(Sum(High) + Sum(Medium) + Sum(Low)) As [Total]
From
(
Select 6 As house_ID,Cast(1028 As Decimal(18,2)) As [2008],'High' As [Rank] union all
Select 2,924,'High' union all
Select 3,862,'Medium' union all
Select 4,724,'Medium' union all
Select 5,628,'Medium' union all
Select 1,569,'Low' union all
Select 7,488,'Low' union all
Select 8,321,'Low' union all
Select 9,284,'Low' union all
Select 10,161,'Low'
)A
Pivot
(Sum([2008])
For Rank in ([High],[Medium],[Low]))B



Matt Miller (#4)
Matt Miller (#4)
SSCertifiable
SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)SSCertifiable (7.7K reputation)

Group: General Forum Members
Points: 7668 Visits: 18111
Sounds messy. From what I'm seeing - you'd have to do a running total (running from highest to lowest.) which you divide by the grand total to come up with a "running percentage". And then - make some kind of business rule as to what belongs in which tier.

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45516 Visits: 39948
upperbognor (9/12/2008)
I'm wondering if there is a faster solution possibly....
It's currently taking the longest calculating the running total due to the cross join.

Yes, there is a much faster method and, yes, the cross join is the problem because it makes a "triangular" join. Please see the following URL's for more information...

Hidden RBAR: Triangular Joins
http://www.sqlservercentral.com/articles/T-SQL/61539/

Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5
http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

I've added the necessary changes to your test code... look for the word "change" to see the changes I've made to your code...

--===== If the test table already exists, drop it     
IF OBJECT_ID('TempDB..#Sales','U') IS NOT NULL
DROP TABLE #Sales
IF OBJECT_ID('TempDB..#RunningTotal2007','U') IS NOT NULL
DROP TABLE #RunningTotal2007
IF OBJECT_ID('TempDB..#RunningTotal2008','U') IS NOT NULL
DROP TABLE #RunningTotal2008
GO
--Create the temp table
Create Table #Sales
(
House_ID Int Primary Key Clustered
,[2007] Decimal(18,2)
,[2008] Decimal(18,2)
,[2007 Rank] nvarchar(50)
,[2008 Rank] nvarchar(50)
)

--Insert our test data (Actual code runs on 15,000 rows +/-)
Insert Into #Sales
Select
House_ID
,[2007]
,[2008]
,Cast(Null As nvarchar(50)) As [2007 Rank]
,Cast(Null As nvarchar(50)) As [2008 Rank]
From
(
Select
1 As House_ID
,125 As [2007]
,569 As [2008]
Union All
Select 2,212,924
Union All
Select 3,354,862
Union All
Select 4,824,724
Union All
Select 5,565,628
Union All
Select 6,618,1028
Union All
Select 7,784,488
Union All
Select 8,465,321
Union All
Select 9,994,284
Union All
Select 10,1016,161
)A


--declare our dividor
Declare @Dividor As Decimal(18,4)

--===== Declare a variable to help with the running total calculations (change)
DECLARE @PrevTotal DECIMAL(9,2)

--get a running total starting at the smallest number and working to the largest
--===== Put the data we need into a temp table... note the additional columns
SELECT IDENTITY(INT,1,1) AS RowNum,
a.House_ID As House_ID
,CAST(SUM(b.[2007]) AS DECIMAL(9,2)) As HouseTotal
,CAST(0 AS DECIMAL(9,2)) AS Total
INTO #RunningTotal2007
FROM #Sales a
INNER JOIN #Sales b ON A.House_ID = B.House_ID
GROUP BY a.House_ID
ORDER BY SUM(b.[2007]) ASC

--===== Throw a clustered key on it to keep things in order during the UPDATE
ALTER TABLE #RunningTotal2007 ADD PRIMARY KEY CLUSTERED (RowNum)

--===== Reset the variable we need to do the running total
SET @PrevTotal = 0

--===== Do the running total as lightning speed
UPDATE #RunningTotal2007
SET @PrevTotal = Total = HouseTotal+@PrevTotal
FROM #RunningTotal2007 WITH(INDEX(0)) --Forces clustered index scan

--===== End of changes


--We want 3 groups of customers, high, medium or low ones, this could also be an integer value
--1,2,3 and converted on our output to high, medium low when we join up with our demographics.
Set @Dividor = (Select Sum(IsNull([2007],0)) / 3 from #Sales)
update #Sales
Set [2007 Rank] = Case
When isNull(Temp.Total,0) <= @Dividor Then 'Low'
When (isNull(Temp.Total,0) > @Dividor) And (isNull(Temp.Total,0) <= @Dividor*2) then 'Medium'
When (isNull(Temp.Total,0) > @Dividor * 2) Then 'High'
End
From
(
Select
House_ID
,Total
From
#RunningTotal2007
)Temp
Where
#Sales.house_ID = temp.house_ID

--Calculate a running total for the next year (Change, whole section replaced)
--===== Put the data we need into a temp table... note the additional columns
SELECT IDENTITY(INT,1,1) AS RowNum,
a.House_ID As House_ID
,CAST(SUM(b.[2008]) AS DECIMAL(9,2)) As HouseTotal
,CAST(0 AS DECIMAL(9,2)) AS Total
INTO #RunningTotal2008
FROM #Sales a
INNER JOIN #Sales b ON A.House_ID = B.House_ID
GROUP BY a.House_ID
ORDER BY SUM(b.[2008]) ASC

--===== Throw a clustered key on it to keep things in order during the UPDATE
ALTER TABLE #RunningTotal2008 ADD PRIMARY KEY CLUSTERED (RowNum)

--===== Reset the variable we need to do the running total
SET @PrevTotal = 0

--===== Do the running total as lightning speed
UPDATE #RunningTotal2008
SET @PrevTotal = Total = HouseTotal+@PrevTotal
FROM #RunningTotal2008 WITH(INDEX(0)) --Forces clustered index scan

--===== End of changes

--Again apply the same divisor rules
Set @Dividor = (Select Sum(IsNull([2008],0)) / 3 from #Sales)
update #Sales
Set [2008 Rank] = Case
When isNull(Temp.Total,0) <= @Dividor Then 'Low'
When (isNull(Temp.Total,0) > @Dividor) And (isNull(Temp.Total,0) <= @Dividor*2) then 'Medium'
When (isNull(Temp.Total,0) > @Dividor * 2) Then 'High'
End
From
(
Select
House_ID
,Total
From
#RunningTotal2008
)Temp
Where
#Sales.house_ID = temp.house_ID


--Lets have the results
Select
*
From
#Sales
Order By
[2008] Desc



... and thank you very much for posting data and code in a usable fashion. It really does help us help you faster. Well done! SmileSmileSmile

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search