Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12345»»»

Solving the "Running Total" & "Ordinal Rank" Problems (Rewritten) Expand / Collapse
Author
Message
Posted Tuesday, November 10, 2009 6:37 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:58 PM
Points: 36,760, Visits: 31,215
ben.mcintyre (11/10/2009)
Superlative. As usual, you make a highly complex and conditional task look simple and straightforward by precisely covering key points. I am now not afraid to try using this method
Any time frame on your part 2 of pivot tables article Jeff ? I was thinking of drafting up a complementary article but wanted to leave you to have a complete say first ...


Thanks for the compliment, Ben... especially on such a controversial method.

Shifting gears and thank you for your interest, Part 2 of the Pivot tables article was actually published a while back and can be found at the following URL...

http://www.sqlservercentral.com/articles/Crosstab/65048/


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #816442
Posted Tuesday, November 10, 2009 6:43 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:58 PM
Points: 36,760, Visits: 31,215
dm_unseen-571460 (11/10/2009)
I'm missig the CTE/Recursive approaches to solving these issues. While usually not single pass the beat triangular joins

For Running total see: http://shahamishm.blogspot.com/2009/04/how-to-do-running-sum-using-cte-sql.html

For finding previous row ordered by key_nr and trans_date(clustered key):

WITH trans_cte(key_nr, trans_date,field1,field2,row_order)
AS(
SELECT
key_nr
,trans_date
,field1
,field2
,ROW_NUMBER()OVER(PARTITIONBY key_nr ORDERBY trans_date ASC) row_order
FROM [dbo].[trans_stage_table]
)
SELECT BASE.key_nr
,BASE.trans_date
,BASE.field1
,BASE.field2
FROM trans_cte BASE
LEFTJOIN trans_cte LAG ON BASE.key_nr = LAG.key_nr
AND BASE.row_order = LAG.row_order+1
and BASE.field1 = LAG.field1
and BASE.field2 = LAG.field2
where LAG.key_nr isnull


Thank you for the link to your article where a running total is resolved by a recurrsive CTE. That's one of the methods I left out due to the shear length (22 pages in Word) of this article. Please take the following constructively... it's not meant as a slam, just an observation. What I believe you'll find is that it's performance is similar to or worse than a cursor. But, before I stick my foot too far into my mouth, I need to test it against the million row test table I created for this article. I'll be back (hopefully tonight) to let you know how that went. Of course, I'll post the code I tested with so that you and others can also run the same test.

Thank you for your interest in solving the running total problem.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #816447
Posted Tuesday, November 10, 2009 6:46 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:58 PM
Points: 36,760, Visits: 31,215
ta.bu.shi.da.yu (11/10/2009)
That's an awesome article Jeff!

Now if only I could work out how to run a query that traverses a directed graph with cycles - but where there is a node that connects to another node that had already been traversed then it does not continue. ..

But rather than muck up this thread, I'll post a question about it a bit later.


Thanks for the great feedback, ta.bu.shi.da.yu. Always appreciated.

On the subject of "a query that traverses a directed graph with cycles", I'd very much appreciate you posting a question on the forum about that especially since I'm not sure what you mean there. Of course, I've only had one cup of coffee so far this morning. Looking forward to the question and the explanation as to what you mean on the subject.




--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #816451
Posted Tuesday, November 10, 2009 6:54 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:58 PM
Points: 36,760, Visits: 31,215
Hugo Kornelis (11/10/2009)
Hi Jeff,

Thanks for doing a rewrite of this article.

Just a few comments before I start really reading the article:

1) Being one of the people who tried to grill you over a small fire for your recommendation to use what you call quirky update, I feel olbliged to point out that, no matter how many tests you run succesfully and how many rules you add to rule out problem cases, using undocumented features will always remain a huge risk. Of course, you are right that you can't prove reliability of SELECT either. But if I ever encounter some weird case where SELECT does not behave as expected and as documented, I can file a bug with Microsoft and have it fixed. If I encounter a case where the quirky update fails to work, I can try to file a bug but I should expect the MS guys to do the told-you-so-dance.

2) The code in figure 2 will blow up on any server with a case sensitive collation. You should change "TempDB" to "tempdb".

That's it for now. I'll now start properly reading the article.


Heh... thanks Hugo. Despite our differences on several of those subjects, you've always been a gentleman in your criticism and observations. Never any harm done there and I very much appreciate it.

On the subject of using undocumented features, I agree... you can't call Microsoft on them. If they ever change the fundamental code behind the UPDATE statement so the "Quirky Update" doesn't work anymore, then we'll all have to resort to a Cursor, While Loop, or a CLR. Hopefully MS will have a properly working SUM() windowing function by then, as well. Heh... the good part is... they're done modifying 2k and 2k5 so I don't believe we have anything to worry about there.

On your item 2, agreed... I don't use any case sensitivity except by column so I tend to overlook the folks that have an entire server set to be case sensitive. Hmmmm... perhaps that would make a decent "Friday Poll"... "Question 1. Do you have any case sensitive servers? Question 2. Either way, please discuss why or why not." I think it would be a great discussion.

Again, even though we tend to disagree on a couple of subjects, I always appreciate the very thoughtful manner in which you discuss those things. Thanks, Hugo.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #816463
Posted Tuesday, November 10, 2009 7:26 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 3:22 PM
Points: 5,925, Visits: 8,174
Thanks Jeff.

I also found a TEMPDB (fully upper case) and a Master when I ran the code to create the million row table. If you have a chance, you might want to change that.

I always run on case sensitive servers so that I know for sure all code I write will work on all servers. (Okay, in theory I could mess up by using object names that only differ in case, but I am not THAT thick).

I don't think cursor, WHILE loop, CLR and RBAR are the only documented and guaranteed alternatives. If you have the SQL Server MVP Deep Dives book, you'll find it in my article on set-based iteration (chapter 4). If you don't have it, shame on you! All royalties go to a great charity so you ought to have at least three copies! Anyway, in this case you'll have to wait until I find the time to adapt my method to your test data, test its performance, and write a longer reply.



Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis
Post #816512
Posted Tuesday, November 10, 2009 7:48 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Saturday, May 24, 2014 7:58 PM
Points: 219, Visits: 824
These are incredibly smart techniques, but after playing with such approaches I decided that I don't want to put it into production - they are just a little bit too complex and they do not look robust to me. In my book this is one of those cases where denormalization allows for much faster and significantly simpler solutions:

[url=http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/23/denormalizing-to-enforce-business-rules-running-totals.aspx][/url]

BTW, Adam Machanic recommends CLR cursors in such cases...
Post #816542
Posted Tuesday, November 10, 2009 8:36 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Thursday, March 6, 2014 1:05 PM
Points: 1,334, Visits: 3,068
Alexander Kuznetsov-291390 (11/10/2009)
I decided that I don't want to put it into production - they are just a little bit too complex and they do not look robust to me. In my book this is one of those cases where denormalization allows for much faster and significantly simpler solutions


Probably a very smart decision in the long run Alex. I have been in that situation many times before where I discover something really kind of neat in SQl Server, but after some serious deliberation and some serious testing as well, I finally realized that discretion is the better part of valor and decided to just leave it out of my production environment.


"Technology is a weird thing. It brings you great gifts with one hand, and it stabs you in the back with the other. ..."
Post #816599
Posted Tuesday, November 10, 2009 9:22 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:58 PM
Points: 36,760, Visits: 31,215
Alexander Kuznetsov-291390 (11/10/2009)
These are incredibly smart techniques, but after playing with such approaches I decided that I don't want to put it into production - they are just a little bit too complex and they do not look robust to me. In my book this is one of those cases where denormalization allows for much faster and significantly simpler solutions:

[url=http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/23/denormalizing-to-enforce-business-rules-running-totals.aspx][/url]

BTW, Adam Machanic recommends CLR cursors in such cases...


First, thanks for taking the time to stop by and post your thoughts, Alexander. I really appreciate it.

I'm not sure why you find the "Quirky Update" a bit too complex nor why you don't think they look robust, but that's OK. Your good method will nicely support ongoing running balance updates on insertion of new data, albeit, in a RBAR manner. I'll add some performance testing using your good method to my list of things to do.

To be sure, your method is very clever... I like it. I just think it will be a bit slow because of it's RBAR nature.

And, yes, I know Adam recommends using a cursor for this... so do a lot of other good folks. If you don't trust the "Quirky Update" you can do as I suggested in the article... use verification code to verify it worked correctly or do like Adam and others suggest.... use a cursor and tolerate the performance and resource usage hit on large batches.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #816630
Posted Tuesday, November 10, 2009 9:33 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:58 PM
Points: 36,760, Visits: 31,215
talltop-969015 (11/10/2009)
Alexander Kuznetsov-291390 (11/10/2009)
I decided that I don't want to put it into production - they are just a little bit too complex and they do not look robust to me. In my book this is one of those cases where denormalization allows for much faster and significantly simpler solutions


Probably a very smart decision in the long run Alex. I have been in that situation many times before where I discover something really kind of neat in SQl Server, but after some serious deliberation and some serious testing as well, I finally realized that discretion is the better part of valor and decided to just leave it out of my production environment.


Yep... that's why I included the Cursor code. If you don't trust in the "Quirky Update" (even though no one has been able to break a properly written one since the first article came out), then do as I said in the article.... use a Cursor, CLR, or some other method you deem satisfactory. I do, however, believe that you're missing out on a very powerful tool. Running totals aren't the only thing this method can be used for.

Either way, thanks for stopping by and taking the time to write a bit of feedback. I appreciate it.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #816642
Posted Tuesday, November 10, 2009 9:42 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Saturday, May 24, 2014 7:58 PM
Points: 219, Visits: 824
Jeff Moden (11/10/2009)
Alexander Kuznetsov-291390 (11/10/2009)
These are incredibly smart techniques, but after playing with such approaches I decided that I don't want to put it into production - they are just a little bit too complex and they do not look robust to me. In my book this is one of those cases where denormalization allows for much faster and significantly simpler solutions:

[url=http://sqlblog.com/blogs/alexander_kuznetsov/archive/2009/01/23/denormalizing-to-enforce-business-rules-running-totals.aspx][/url]

BTW, Adam Machanic recommends CLR cursors in such cases...


First, thanks for taking the time to stop by and post your thoughts, Alexander. I really appreciate it.

I'm not sure why you find the "Quirky Update" a bit too complex nor why you don't think the look robust, but that's OK. Your good method will nicely support ongoing running balance updates on insertion of new data, albeit, in a RBAR manner. I'll add some performance testing using your good method to my list of things to do.

To be sure, your method is very clever... I like it. I just think it will be a bit slow because of it's RBAR nature.



Jeff, I am not sure how RBAR is relevant to denormalization, yet when the running totals are stored right in your row, you don't need any joins, any cursors, anything - selects are as fast as it goes. For inserts, you retrieve only one previous row, so your inserts do not slow down as your table grows...
Post #816651
« Prev Topic | Next Topic »

Add to briefcase ««12345»»»

Permissions Expand / Collapse