Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Solving the "Running Total" & "Ordinal Rank" Problems (Rewritten)


Solving the "Running Total" & "Ordinal Rank" Problems (Rewritten)

Author
Message
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44871 Visits: 39851
ben.mcintyre (11/10/2009)
Superlative. As usual, you make a highly complex and conditional task look simple and straightforward by precisely covering key points. I am now not afraid to try using this method :-)
Any time frame on your part 2 of pivot tables article Jeff ? I was thinking of drafting up a complementary article but wanted to leave you to have a complete say first ...


Thanks for the compliment, Ben... especially on such a controversial method.

Shifting gears and thank you for your interest, Part 2 of the Pivot tables article was actually published a while back and can be found at the following URL...

http://www.sqlservercentral.com/articles/Crosstab/65048/

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44871 Visits: 39851
dm_unseen-571460 (11/10/2009)
I'm missig the CTE/Recursive approaches to solving these issues. While usually not single pass the beat triangular joins

For Running total see: http://shahamishm.blogspot.com/2009/04/how-to-do-running-sum-using-cte-sql.html

For finding previous row ordered by key_nr and trans_date(clustered key):

WITH trans_cte(key_nr, trans_date,field1,field2,row_order)
AS(
SELECT
key_nr
,trans_date
,field1
,field2
,ROW_NUMBER()OVER(PARTITIONBY key_nr ORDERBY trans_date ASC) row_order
FROM [dbo].[trans_stage_table]
)
SELECT BASE.key_nr
,BASE.trans_date
,BASE.field1
,BASE.field2
FROM trans_cte BASE
LEFTJOIN trans_cte LAG ON BASE.key_nr = LAG.key_nr
AND BASE.row_order = LAG.row_order+1
and BASE.field1 = LAG.field1
and BASE.field2 = LAG.field2
where LAG.key_nr isnull


Thank you for the link to your article where a running total is resolved by a recurrsive CTE. That's one of the methods I left out due to the shear length (22 pages in Word) of this article. Please take the following constructively... it's not meant as a slam, just an observation. What I believe you'll find is that it's performance is similar to or worse than a cursor. But, before I stick my foot too far into my mouth, I need to test it against the million row test table I created for this article. I'll be back (hopefully tonight) to let you know how that went. Of course, I'll post the code I tested with so that you and others can also run the same test.

Thank you for your interest in solving the running total problem.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44871 Visits: 39851
ta.bu.shi.da.yu (11/10/2009)
That's an awesome article Jeff!

Now if only I could work out how to run a query that traverses a directed graph with cycles - but where there is a node that connects to another node that had already been traversed then it does not continue. ..

But rather than muck up this thread, I'll post a question about it a bit later. :-)


Thanks for the great feedback, ta.bu.shi.da.yu. Always appreciated.

On the subject of "a query that traverses a directed graph with cycles", I'd very much appreciate you posting a question on the forum about that especially since I'm not sure what you mean there. Of course, I've only had one cup of coffee so far this morning. :-P Looking forward to the question and the explanation as to what you mean on the subject.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44871 Visits: 39851
Hugo Kornelis (11/10/2009)
Hi Jeff,

Thanks for doing a rewrite of this article.

Just a few comments before I start really reading the article:

1) Being one of the people who tried to grill you over a small fire for your recommendation to use what you call quirky update, I feel olbliged to point out that, no matter how many tests you run succesfully and how many rules you add to rule out problem cases, using undocumented features will always remain a huge risk. Of course, you are right that you can't prove reliability of SELECT either. But if I ever encounter some weird case where SELECT does not behave as expected and as documented, I can file a bug with Microsoft and have it fixed. If I encounter a case where the quirky update fails to work, I can try to file a bug but I should expect the MS guys to do the told-you-so-dance.

2) The code in figure 2 will blow up on any server with a case sensitive collation. You should change "TempDB" to "tempdb".

That's it for now. I'll now start properly reading the article.


Heh... thanks Hugo. Despite our differences on several of those subjects, you've always been a gentleman in your criticism and observations. Never any harm done there and I very much appreciate it.

On the subject of using undocumented features, I agree... you can't call Microsoft on them. If they ever change the fundamental code behind the UPDATE statement so the "Quirky Update" doesn't work anymore, then we'll all have to resort to a Cursor, While Loop, or a CLR. Hopefully MS will have a properly working SUM() windowing function by then, as well. Heh... the good part is... they're done modifying 2k and 2k5 so I don't believe we have anything to worry about there. :-)

On your item 2, agreed... I don't use any case sensitivity except by column so I tend to overlook the folks that have an entire server set to be case sensitive. Hmmmm... perhaps that would make a decent "Friday Poll"... "Question 1. Do you have any case sensitive servers? Question 2. Either way, please discuss why or why not." I think it would be a great discussion.

Again, even though we tend to disagree on a couple of subjects, I always appreciate the very thoughtful manner in which you discuss those things. Thanks, Hugo.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Hugo Kornelis
Hugo Kornelis
SSCrazy Eights
SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)SSCrazy Eights (8.3K reputation)

Group: General Forum Members
Points: 8295 Visits: 11538
Thanks Jeff.

I also found a TEMPDB (fully upper case) and a Master when I ran the code to create the million row table. If you have a chance, you might want to change that.

I always run on case sensitive servers so that I know for sure all code I write will work on all servers. (Okay, in theory I could mess up by using object names that only differ in case, but I am not THAT thick).

I don't think cursor, WHILE loop, CLR and RBAR are the only documented and guaranteed alternatives. If you have the SQL Server MVP Deep Dives book, you'll find it in my article on set-based iteration (chapter 4). If you don't have it, shame on you! All royalties go to a great charity so you ought to have at least three copies! :-P Anyway, in this case you'll have to wait until I find the time to adapt my method to your test data, test its performance, and write a longer reply.


Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Alexander Kuznetsov
Alexander Kuznetsov
SSC Veteran
SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)

Group: General Forum Members
Points: 243 Visits: 824
These are incredibly smart techniques, but after playing with such approaches I decided that I don't want to put it into production - they are just a little bit too complex and they do not look robust to me. In my book this is one of those cases where denormalization allows for much faster and significantly simpler solutions:

[url=http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/23/denormalizing-to-enforce-business-rules-running-totals.aspx][/url]

BTW, Adam Machanic recommends CLR cursors in such cases...
TravisDBA
TravisDBA
UDP Broadcaster
UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)UDP Broadcaster (1.5K reputation)

Group: General Forum Members
Points: 1460 Visits: 3069
Alexander Kuznetsov-291390 (11/10/2009)
I decided that I don't want to put it into production - they are just a little bit too complex and they do not look robust to me. In my book this is one of those cases where denormalization allows for much faster and significantly simpler solutions


Probably a very smart decision in the long run Alex. I have been in that situation many times before where I discover something really kind of neat in SQl Server, but after some serious deliberation and some serious testing as well, I finally realized that discretion is the better part of valor and decided to just leave it out of my production environment. :-P

"Technology is a weird thing. It brings you great gifts with one hand, and it stabs you in the back with the other. ...:-D"
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44871 Visits: 39851
Alexander Kuznetsov-291390 (11/10/2009)
These are incredibly smart techniques, but after playing with such approaches I decided that I don't want to put it into production - they are just a little bit too complex and they do not look robust to me. In my book this is one of those cases where denormalization allows for much faster and significantly simpler solutions:

[url=http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/23/denormalizing-to-enforce-business-rules-running-totals.aspx][/url]

BTW, Adam Machanic recommends CLR cursors in such cases...


First, thanks for taking the time to stop by and post your thoughts, Alexander. I really appreciate it.

I'm not sure why you find the "Quirky Update" a bit too complex nor why you don't think they look robust, but that's OK. Your good method will nicely support ongoing running balance updates on insertion of new data, albeit, in a RBAR manner. I'll add some performance testing using your good method to my list of things to do.

To be sure, your method is very clever... I like it. I just think it will be a bit slow because of it's RBAR nature.

And, yes, I know Adam recommends using a cursor for this... so do a lot of other good folks. If you don't trust the "Quirky Update" you can do as I suggested in the article... use verification code to verify it worked correctly or do like Adam and others suggest.... use a cursor and tolerate the performance and resource usage hit on large batches. ;-)

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)SSC-Forever (44K reputation)

Group: General Forum Members
Points: 44871 Visits: 39851
talltop-969015 (11/10/2009)
Alexander Kuznetsov-291390 (11/10/2009)
I decided that I don't want to put it into production - they are just a little bit too complex and they do not look robust to me. In my book this is one of those cases where denormalization allows for much faster and significantly simpler solutions


Probably a very smart decision in the long run Alex. I have been in that situation many times before where I discover something really kind of neat in SQl Server, but after some serious deliberation and some serious testing as well, I finally realized that discretion is the better part of valor and decided to just leave it out of my production environment. :-P


Yep... that's why I included the Cursor code. If you don't trust in the "Quirky Update" (even though no one has been able to break a properly written one since the first article came out), then do as I said in the article.... use a Cursor, CLR, or some other method you deem satisfactory. I do, however, believe that you're missing out on a very powerful tool. Running totals aren't the only thing this method can be used for.

Either way, thanks for stopping by and taking the time to write a bit of feedback. I appreciate it.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
     Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is usually not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Alexander Kuznetsov
Alexander Kuznetsov
SSC Veteran
SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)SSC Veteran (243 reputation)

Group: General Forum Members
Points: 243 Visits: 824
Jeff Moden (11/10/2009)
Alexander Kuznetsov-291390 (11/10/2009)
These are incredibly smart techniques, but after playing with such approaches I decided that I don't want to put it into production - they are just a little bit too complex and they do not look robust to me. In my book this is one of those cases where denormalization allows for much faster and significantly simpler solutions:

[url=http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/23/denormalizing-to-enforce-business-rules-running-totals.aspx][/url]

BTW, Adam Machanic recommends CLR cursors in such cases...


First, thanks for taking the time to stop by and post your thoughts, Alexander. I really appreciate it.

I'm not sure why you find the "Quirky Update" a bit too complex nor why you don't think the look robust, but that's OK. Your good method will nicely support ongoing running balance updates on insertion of new data, albeit, in a RBAR manner. I'll add some performance testing using your good method to my list of things to do.

To be sure, your method is very clever... I like it. I just think it will be a bit slow because of it's RBAR nature.



Jeff, I am not sure how RBAR is relevant to denormalization, yet when the running totals are stored right in your row, you don't need any joins, any cursors, anything - selects are as fast as it goes. For inserts, you retrieve only one previous row, so your inserts do not slow down as your table grows...
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search