Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5

Question

Post reply

Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5

Viewing 15 posts - 226 through 240 (of 250 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004290 More actions · Answer 1

AnzioBake (12/22/2008)
What happens for for tables that the rows are not in a clustered index with the right order.

Like I said in the article... copy to a temp table and index that. It's still faster than using a cursor.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

AnzioBake SSCarpal Tunnel Points: 4134 More actions · Answer 2

AnzioBake

SSCarpal Tunnel

Points: 4134

December 22, 2008 at 7:00 am

#916121

Agreed not RBAR, but RBR LOL:D

LeeBear35 SSC Enthusiast Points: 124 More actions · Answer 3

As like the solution using the Anchor value, I have not seen that "fix" before. I also wondered if the clustered index was generated with a fillfactor=100 is the anchor still needed?

Also, do you think the will ever the ability to do computations in a SELECT statement like those shown on the UPDATE statement. If they did then the solution could be as simple as:

use [NorthWind]

go

DECLARE @SumFreight AS MONEY

DECLARE @CntFreight AS INT

SELECT

@SumFreight = 0.0,

@CntFreight = 0

SELECT

[OrderID],

[Freight],

@SumFreight = [RunningTotal] = @SumFreight + ISNULL([Freight], 0.0),

@CntFreight = [RunningCount] = @CntFreight + 1

FROM [dbo].[Orders]

ORDER BY

[OrderId]

RBarryYoung SSC Guru Points: 143329 More actions · Answer 4

You can do the computations in a SELECT statement, you just cannot assign to both variables and columns in a select statement.

Shame too.

[font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung [font="Arial Black"]
Proactive Performance Solutions, Inc. [/font][font="Verdana"] "Performance is our middle name."[/font]

Jeff Moden SSC Guru Points: 1004290 More actions · Answer 5

LeeBear35 (1/16/2009)
As like the solution using the Anchor value, I have not seen that "fix" before. I also wondered if the clustered index was generated with a fillfactor=100 is the anchor still needed?
Also, do you think the will ever the ability to do computations in a SELECT statement like those shown on the UPDATE statement. If they did then the solution could be as simple as:
use [NorthWind]
go
DECLARE @SumFreight AS MONEY
DECLARE @CntFreight AS INT
SELECT
@SumFreight = 0.0,
@CntFreight = 0
SELECT
[OrderID],
[Freight],
@SumFreight = [RunningTotal] = @SumFreight + ISNULL([Freight], 0.0),
@CntFreight = [RunningCount] = @CntFreight + 1
FROM [dbo].[Orders]
ORDER BY
[OrderId]

I absolutely agree... Wouldn't THAT be wonderful?! That would solve so very many problems!

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Martin Stephenson SSCrazy Points: 2950 More actions · Answer 6

Great article, I 've been able to apply it to a pain in the neck I've been working on, but at the risk of looking like a thicko --:D I cant get the order by update to work with a non clustered index

If I run the following code

--===== Add the primary key

ALTER TABLE dbo.JBMTest

ADD PRIMARY KEY CLUSTERED (RowNum) --clustered for "Merry-go-Round" test

--===== Add the "sorting index" to the table

CREATE NONCLUSTERED INDEX IX_JBMTest_AccountID_Date --not clustered for "Merry-go-Round" test

ON dbo.JBMTest (AccountID, Date)

--===== Build the table 100 rows at a time to "mix things up"

DECLARE @Counter INT

SET @Counter = 0

WHILE @Counter < 1000000

BEGIN --===== Add 1000 rows to the test table

INSERT INTO dbo.JBMTest

(AccountID, Amount, Date)

SELECT TOP 100

AccountID = ABS(CHECKSUM(NEWID()))%50000+1,

Amount = CAST(CHECKSUM(NEWID())%10000 /100.0 AS MONEY),

Date = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME)

FROM Master.dbo.SysColumns t1

CROSS JOIN Master.dbo.SysColumns t2

--===== Increment the counter

SET @Counter = @Counter + 100

END

--===== Declare the required variables

DECLARE @PrevBal MONEY

SET @PrevBal = 0

DECLARE @AccountID INT

SET @AccountID = 0

--===== Calculate the running total using SQL Server's proprietary update

-- and force the order of the update with and ORDER BY

UPDATE t1

SET @PrevBal = RunBal = t1.Amount + @PrevBal, @AccountID = t1.AccountID --This does nothing but provide an "anchor"

FROM dbo.JBMTest t1 INNER JOIN (--==== Derived table "d" provides the sort order for the update

SELECT TOP 100 PERCENT rownum,AccountID, Date

FROM dbo.JBMTest

ORDER BY AccountID, Date )d ON t1.rownum = d.rownum

I get a very quick update based on rownum order, I 'va attached the execution plan

I looked at this and thought it was giving me the wrong results because it is scanning the clustered PK on both tables before the join and therfore is getting the rownum order so I changed the join to

'ON t1.accountID = d.accountID and t1.date = d.date'

and the query plan now shows a scan on the index of the subquery

and gives me the right results,

apart from one mention on the first page of the discussion 200 or so posts ago no one else seems to have had this problem, or am I missing the point of the merry go round index

:unsure:

Jeff Moden SSC Guru Points: 1004290 More actions · Answer 7

Thanks for the detailed explanation... I'll take a look tonight after I get home.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

LeeBear35 SSC Enthusiast Points: 124 More actions · Answer 8

I have encountered problems with Merry Go Round issues too. On small samples it would always work but as data sets grew it was more and more of an issue then I realized that even with a clustered index if you have an 80% fill factor the page is only filled with the first 6K of data then goes to the next page. This happens until all the currently allocated pages in the table are are filled to 80%, then the remaining 20% is filled then a new set of pages are allocated.

I have often found a clustered index with a WITH Fillfactor=100 prevents this issue, but does not always resolve the problem.

Happy hunting...

Jeff Moden SSC Guru Points: 1004290 More actions · Answer 9

Martin Stephenson (1/30/2009)
Great article, I 've been able to apply it to a pain in the neck I've been working on, but at the risk of looking like a thicko --:D I cant get the order by update to work with a non clustered index...
... I looked at this and thought it was giving me the wrong results because it is scanning the clustered PK on both tables before the join and therfore is getting the rownum order so I changed the join to
'ON t1.accountID = d.accountID and t1.date = d.date'
and the query plan now shows a scan on the index of the subquery
and gives me the right results,
apart from one mention on the first page of the discussion 200 or so posts ago no one else seems to have had this problem, or am I missing the point of the merry go round index
:unsure:

Oh my my my [font="Arial Black"]MY![/font] The irony of it all! :D:P:hehe: In the words of a very wise man, "The Lord giveth, and Microsoft taketh away! 😉

Everyone kept busting my hump about how the "quirky" update was undocumented and could "change at the next service pack" and insisted that I use an ORDER BY, which I did. Here's the really funny part... the ORDER BY update used to work just fine when I wrote this article. I hadn't yet installed SQL Server 2005. I was still working on just SQL Server 2000 sp3a (I didn't trust sp4 so never installed it) and it worked just fine... then, I never used the ORDER BY method ever again because, frankly, it was too slow for me. Since that time, I've installed SQL Server 2005 and sp2... now, NONE of the original ORDER BY code works on my box either in 2k or 2k5 with or without the additional index! Even the mods you put on the code don't work on my box even though you say it does on your box. (I wouldn't use the ORDER BY method if I were you :))

The irony is that such a well documented feature as ORDER BY changed without warning and the "undocumented" feature keeps right on working as advertised.

I'll make a note of this at the start of the thread so people are made aware. It also gives me incentive to expedite the rewrite I'm currently working on for the article.

Thanks for bringing this to my attention!

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1004290 More actions · Answer 10

LeeBear35 (1/30/2009)
I have encountered problems with Merry Go Round issues too. On small samples it would always work but as data sets grew it was more and more of an issue then I realized that even with a clustered index if you have an 80% fill factor the page is only filled with the first 6K of data then goes to the next page. This happens until all the currently allocated pages in the table are are filled to 80%, then the remaining 20% is filled then a new set of pages are allocated.
I have often found a clustered index with a WITH Fillfactor=100 prevents this issue, but does not always resolve the problem.
Happy hunting...

Please see the post just above this one for an "explanation" of what happened.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Allain Bøge Valued Member Points: 69 More actions · Answer 11

YES YES, it is working, had to make some minor alterations due to an extra dimension (dataareaid)

Thanks a lot for the quick reply and the correct answer.

I went from 4.5 hours to 7 !!!! seconds, talk about an improvement. I added my alteration in case any can use it again

Thanks again, and specially thanks to Jeff Moden witch is the originator of the code

Kind regards

ABB

-- Original code from Jeff Moden alterated by ABB

DECLARE @PrevCompany varchar(3)

SET @PrevCompany = ''

DECLARE @PrevCompanyBalance MONEY --running total for each company

SET @PrevCompanyBalance = 0

DECLARE @PrevGrpBal MONEY --Running total resets when account changes

SET @PrevGrpBal = 0

DECLARE @PrevRunCnt INT --Overall running count (ordinal rank)

SET @PrevRunCnt = 0

DECLARE @PrevGrpCnt INT --Running count resets when account changes

SET @PrevGrpCnt = 0

DECLARE @PrevAcctID varchar(10) --The "account change detector"

SET @PrevAcctID = ''

UPDATE dbo.DW_CustTrans

SET --===== Running Total Company

@PrevCompanyBalance = CompanyBalance = CASE

WHEN dataareaid = @Prevcompany

THEN @PrevCompanyBalance + Amount

ELSE Amount

END,

--===== Account Running Total (Reset when account changes)

@PrevGrpBal = Balance_MST = CASE

WHEN accountnumber = @PrevAcctID

THEN @PrevGrpBal + Amount

ELSE Amount -- Restarts total at "0 + current amount"

END,

--===== Running Count (Ordinal Rank)

@PrevRunCnt = RunCnt = @PrevRunCnt + 1,

--===== Account Running Total (Ordinal Rank, Reset when account changes)

@PrevGrpCnt = CompanyCount = CASE

WHEN accountnumber = @PrevAcctID

THEN @PrevGrpCnt + 1

ELSE 1 -- Restarts count at "1"

END,

--===== "Anchor" and provides for "account and Company change detection"

@PrevAcctID = accountnumber,

@prevCompany = dataareaid

FROM dbo.DW_CustTrans WITH (INDEX(IX_DW_CustTrans),TABLOCKX)

Jeff Moden SSC Guru Points: 1004290 More actions · Answer 12

Ummm.... no.... to be as safe as possible with this method, you MUST include a FROM clause in the update and it must use a TabLockX and index hint that forces the use of the clustered index.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 13

Jeff, Do you have access to SQL Server 2008? At the SSWUGvConference I heard the the ORDER BY "may be" honored again in the UPDATE statement. I don't have SQL Server 2008 yet so I can't test it.

Jeff Moden SSC Guru Points: 1004290 More actions · Answer 14

Lynn Pettis (4/24/2009)
Jeff, Do you have access to SQL Server 2008? At the SSWUGvConference I heard the the ORDER BY "may be" honored again in the UPDATE statement. I don't have SQL Server 2008 yet so I can't test it.

That's good news, but, no, I don't have 2k8, yet.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 15

Jeff Moden (4/24/2009)
Lynn Pettis (4/24/2009)
Jeff, Do you have access to SQL Server 2008? At the SSWUGvConference I heard the the ORDER BY "may be" honored again in the UPDATE statement. I don't have SQL Server 2008 yet so I can't test it.
That's good news, but, no, I don't have 2k8, yet.

Well, it isn't good news until it is proven true. I'd call it hopeful news.