• upperbognor (9/12/2008)


    I'm wondering if there is a faster solution possibly....

    It's currently taking the longest calculating the running total due to the cross join.

    Yes, there is a much faster method and, yes, the cross join is the problem because it makes a "triangular" join. Please see the following URL's for more information...

    [font="Arial Black"]Hidden RBAR: Triangular Joins[/font]

    http://www.sqlservercentral.com/articles/T-SQL/61539/

    [font="Arial Black"]Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5[/font]

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    I've added the necessary changes to your test code... look for the word "change" to see the changes I've made to your code...

    --===== If the test table already exists, drop it

    IF OBJECT_ID('TempDB..#Sales','U') IS NOT NULL

    DROP TABLE #Sales

    IF OBJECT_ID('TempDB..#RunningTotal2007','U') IS NOT NULL

    DROP TABLE #RunningTotal2007

    IF OBJECT_ID('TempDB..#RunningTotal2008','U') IS NOT NULL

    DROP TABLE #RunningTotal2008

    GO

    --Create the temp table

    Create Table #Sales

    (

    House_ID Int Primary Key Clustered

    ,[2007] Decimal(18,2)

    ,[2008] Decimal(18,2)

    ,[2007 Rank] nvarchar(50)

    ,[2008 Rank] nvarchar(50)

    )

    --Insert our test data (Actual code runs on 15,000 rows +/-)

    Insert Into #Sales

    Select

    House_ID

    ,[2007]

    ,[2008]

    ,Cast(Null As nvarchar(50)) As [2007 Rank]

    ,Cast(Null As nvarchar(50)) As [2008 Rank]

    From

    (

    Select

    1 As House_ID

    ,125 As [2007]

    ,569 As [2008]

    Union All

    Select 2,212,924

    Union All

    Select 3,354,862

    Union All

    Select 4,824,724

    Union All

    Select 5,565,628

    Union All

    Select 6,618,1028

    Union All

    Select 7,784,488

    Union All

    Select 8,465,321

    Union All

    Select 9,994,284

    Union All

    Select 10,1016,161

    )A

    --declare our dividor

    Declare @Dividor As Decimal(18,4)

    --===== Declare a variable to help with the running total calculations (change)

    DECLARE @PrevTotal DECIMAL(9,2)

    --get a running total starting at the smallest number and working to the largest

    --===== Put the data we need into a temp table... note the additional columns

    SELECT IDENTITY(INT,1,1) AS RowNum,

    a.House_ID As House_ID

    ,CAST(SUM(b.[2007]) AS DECIMAL(9,2)) As HouseTotal

    ,CAST(0 AS DECIMAL(9,2)) AS Total

    INTO #RunningTotal2007

    FROM #Sales a

    INNER JOIN #Sales b ON A.House_ID = B.House_ID

    GROUP BY a.House_ID

    ORDER BY SUM(b.[2007]) ASC

    --===== Throw a clustered key on it to keep things in order during the UPDATE

    ALTER TABLE #RunningTotal2007 ADD PRIMARY KEY CLUSTERED (RowNum)

    --===== Reset the variable we need to do the running total

    SET @PrevTotal = 0

    --===== Do the running total as lightning speed

    UPDATE #RunningTotal2007

    SET @PrevTotal = Total = HouseTotal+@PrevTotal

    FROM #RunningTotal2007 WITH(INDEX(0)) --Forces clustered index scan

    --===== End of changes

    --We want 3 groups of customers, high, medium or low ones, this could also be an integer value

    --1,2,3 and converted on our output to high, medium low when we join up with our demographics.

    Set @Dividor = (Select Sum(IsNull([2007],0)) / 3 from #Sales)

    update #Sales

    Set [2007 Rank] = Case

    When isNull(Temp.Total,0) <= @Dividor Then 'Low'

    When (isNull(Temp.Total,0) > @Dividor) And (isNull(Temp.Total,0) <= @Dividor*2) then 'Medium'

    When (isNull(Temp.Total,0) > @Dividor * 2) Then 'High'

    End

    From

    (

    Select

    House_ID

    ,Total

    From

    #RunningTotal2007

    )Temp

    Where

    #Sales.house_ID = temp.house_ID

    --Calculate a running total for the next year (Change, whole section replaced)

    --===== Put the data we need into a temp table... note the additional columns

    SELECT IDENTITY(INT,1,1) AS RowNum,

    a.House_ID As House_ID

    ,CAST(SUM(b.[2008]) AS DECIMAL(9,2)) As HouseTotal

    ,CAST(0 AS DECIMAL(9,2)) AS Total

    INTO #RunningTotal2008

    FROM #Sales a

    INNER JOIN #Sales b ON A.House_ID = B.House_ID

    GROUP BY a.House_ID

    ORDER BY SUM(b.[2008]) ASC

    --===== Throw a clustered key on it to keep things in order during the UPDATE

    ALTER TABLE #RunningTotal2008 ADD PRIMARY KEY CLUSTERED (RowNum)

    --===== Reset the variable we need to do the running total

    SET @PrevTotal = 0

    --===== Do the running total as lightning speed

    UPDATE #RunningTotal2008

    SET @PrevTotal = Total = HouseTotal+@PrevTotal

    FROM #RunningTotal2008 WITH(INDEX(0)) --Forces clustered index scan

    --===== End of changes

    --Again apply the same divisor rules

    Set @Dividor = (Select Sum(IsNull([2008],0)) / 3 from #Sales)

    update #Sales

    Set [2008 Rank] = Case

    When isNull(Temp.Total,0) <= @Dividor Then 'Low'

    When (isNull(Temp.Total,0) > @Dividor) And (isNull(Temp.Total,0) <= @Dividor*2) then 'Medium'

    When (isNull(Temp.Total,0) > @Dividor * 2) Then 'High'

    End

    From

    (

    Select

    House_ID

    ,Total

    From

    #RunningTotal2008

    )Temp

    Where

    #Sales.house_ID = temp.house_ID

    --Lets have the results

    Select

    *

    From

    #Sales

    Order By

    [2008] Desc

    ... and thank you very much for posting data and code in a usable fashion. It really does help us help you faster. Well done! :):):)

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)