Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Solving the “Running Totals” Problem in SQL Server 2005 with Partitioned Tables


Solving the “Running Totals” Problem in SQL Server 2005 with Partitioned Tables

Author
Message
Lynn Pettis
Lynn Pettis
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24283 Visits: 37991
Comments posted to this topic are about the item Solving the “Running Totals” Problem in SQL Server 2005 with Partitioned Tables

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
John Walker-229141
John Walker-229141
Forum Newbie
Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)

Group: General Forum Members
Points: 7 Visits: 32
I think Jeff hinted that he was going to eventually get to the analytics solution and far be it from me to steal his thunder, but I couldn't help wondering if either I have misunderstood the scale of the problem or the solutions displayed thus far have been grossly convoluted. From my understanding of the problem at its simplest, we would like to compute a running total (leaving any notion of partitioning aside for the moment). Surely all that is needed is something like this:

select AccountId,
Date,
Amount,
sum(Amount) over (partition by AccountId, order by Date) as "Running_Balance"
from JBMTest
order by AccountId, Date

Now, partitioning. What has that got to do with anything? Again, from our simple perspective we don't care. Let our optimizer handle such details. As the query stands, we're looking to cover the entire table anyway. Given that our partitions are based on date, if we restrict to one year, we touch only one partition - but that is not something we need to explicitly state in a hint or some such construct, surely.
peter-757102
peter-757102
SSC-Addicted
SSC-Addicted (471 reputation)SSC-Addicted (471 reputation)SSC-Addicted (471 reputation)SSC-Addicted (471 reputation)SSC-Addicted (471 reputation)SSC-Addicted (471 reputation)SSC-Addicted (471 reputation)SSC-Addicted (471 reputation)

Group: General Forum Members
Points: 471 Visits: 2549
Not having access to a developer or Enterprise version of SQL Server I find this topic hard to go into myself. Some of the SQL features used are only available on those versions for those wondering why I bring it up. Still I did quickly read trough the essential parts that make up the solution and the path followed to get there and got a bit of a bad feeling after reading it.

Searching for working methods is one thing, but what bugs me in this case is that the final output depends on query hints like OPTION (MAXDOP 1) that technically should not affect the output at all. The solution is relying on SQL Server implementation details that are unspecified by microsoft and are likely to change over time.

Even when implementation details do not change, maintainability is at risk because the technique breaks with the very useful concept of SQL itself. You ask a question and specify what information you want as output, in what order. It is then up to the database to deliver according to your specifications as best as it can, isolating you from the details on how it does it.

Imagine using the described technique and 4 years from now you move to a 4 processor, 12 core system and solid state disks. There turns out to be unexpected low performance and an expert is hired to see what is wrong. First thing that would get the boot would be the OPTION (MAXDOP 1). It would speed the result significantly and since such a hint SHOULD NEVER affect output it is unlikely to be noticed right away.

You see where I am going with this....it feels like playing dice with your data!
Lynn Pettis
Lynn Pettis
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24283 Visits: 37991
John Walker (1/27/2009)
I think Jeff hinted that he was going to eventually get to the analytics solution and far be it from me to steal his thunder, but I couldn't help wondering if either I have misunderstood the scale of the problem or the solutions displayed thus far have been grossly convoluted. From my understanding of the problem at its simplest, we would like to compute a running total (leaving any notion of partitioning aside for the moment). Surely all that is needed is something like this:

select AccountId,
Date,
Amount,
sum(Amount) over (partition by AccountId, order by Date) as "Running_Balance"
from JBMTest
order by AccountId, Date

Now, partitioning. What has that got to do with anything? Again, from our simple perspective we don't care. Let our optimizer handle such details. As the query stands, we're looking to cover the entire table anyway. Given that our partitions are based on date, if we restrict to one year, we touch only one partition - but that is not something we need to explicitly state in a hint or some such construct, surely.


Unfortunately, at this time, the window functions in SQL Server are limited and won't give you the answers that you need to answer the running total question. The row_number() function can be used in SQL Server 2005/2008 to handle the ranking problem, but that still leaves running balanaces. Until Microsoft fully implements the windowing functions in SQL Server, and you are looking to implement proformant code to meet this need, you will need to look at methods like this that utilize the database engine to its fullest.

In addition, this is but an example that happens to use date for the partitioning. What if the partioning were done based on account ranges, for example, because the application design merited such a design. Please remember, the purpose of the article was to demonstrate a technique that could be used to help solve a problem.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Lynn Pettis
Lynn Pettis
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24283 Visits: 37991
peter (1/27/2009)
Not having access to a developer or Enterprise version of SQL Server I find this topic hard to go into myself. Some of the SQL features used are only available on those versions for those wondering why I bring it up. Still I did quickly read trough the essential parts that make up the solution and the path followed to get there and got a bit of a bad feeling after reading it.

Searching for working methods is one thing, but what bugs me in this case is that the final output depends on query hints like OPTION (MAXDOP 1) that technically should not affect the output at all. The solution is relying on SQL Server implementation details that are unspecified by microsoft and are likely to change over time.

Even when implementation details do not change, maintainability is at risk because the technique breaks with the very useful concept of SQL itself. You ask a question and specify what information you want as output, in what order. It is then up to the database to deliver according to your specifications as best as it can, isolating you from the details on how it does it.

Imagine using the described technique and 4 years from now you move to a 4 processor, 12 core system and solid state disks. There turns out to be unexpected low performance and an expert is hired to see what is wrong. First thing that would get the boot would be the OPTION (MAXDOP 1). It would speed the result significantly and since such a hint SHOULD NEVER affect output it is unlikely to be noticed right away.

You see where I am going with this....it feels like playing dice with your data!


I understand your concern, however, part of it can be solved by one simple thing and that is documentation. If this technique is used in a production system, the use of the OPTION (MAXDOP 1) should be well commented so that someone else coming along later would understand why it was being used and would not remove it, thereby breaking the process.

I wish I had a true single processor system (no hyper threading or multiple cores) to have tested this on to see how it worked. Would I have thought that this worked without MAXDOP 1 in that environment? If it did, what would my reaction have been when I moved it to multi-processor, multi-core environment and it suddenly failed (remember, I did a final test in such an environment)?

Maybe someone out there has a true single processor system and access to SQL Server 2005 Developer Edition and they will take the code from this article and run these tests and let us know the results of that testing.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Smendle
Smendle
SSC-Enthusiastic
SSC-Enthusiastic (165 reputation)SSC-Enthusiastic (165 reputation)SSC-Enthusiastic (165 reputation)SSC-Enthusiastic (165 reputation)SSC-Enthusiastic (165 reputation)SSC-Enthusiastic (165 reputation)SSC-Enthusiastic (165 reputation)SSC-Enthusiastic (165 reputation)

Group: General Forum Members
Points: 165 Visits: 1065
I wish I had a true single processor system (no hyper threading or multiple cores) to have tested this on to see how it worked. Would I have thought that this worked without MAXDOP 1 in that environment? If it did, what would my reaction have been when I moved it to multi-processor, multi-core environment and it suddenly failed (remember, I did a final test in such an environment)?


Oh but you do! Go into your BIOS and turn that stuff off!Hehe
Lynn Pettis
Lynn Pettis
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24283 Visits: 37991
Smendle (1/27/2009)
I wish I had a true single processor system (no hyper threading or multiple cores) to have tested this on to see how it worked. Would I have thought that this worked without MAXDOP 1 in that environment? If it did, what would my reaction have been when I moved it to multi-processor, multi-core environment and it suddenly failed (remember, I did a final test in such an environment)?


Oh but you do! Go into your BIOS and turn that stuff off!Hehe


Maybe, but I'm a software geek, not a hardware geek. I'm not exactly sure how to go about doing that, and I'd be worried that I'd break something (and I'm not joking on that).

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
noeld
noeld
SSCertifiable
SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)SSCertifiable (6.3K reputation)

Group: General Forum Members
Points: 6336 Visits: 2048
I liked your article ( a lot ) and it seems that you put a lot of effort into it.

My view of running totals is simple. It is a CLIENT-SIDE responsibility.

Although it may provide you with an alternative to deliver to the developers what they may want out of the DB
IMHO looping through an ordered result set in the client is by far more scalable and "faster" end-to-end from the User point of view.

Just my $0.02


* Noel
Lynn Pettis
Lynn Pettis
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24283 Visits: 37991
noeld (1/27/2009)
I liked your article ( a lot ) and it seems that you put a lot of effort into it.

My view of running totals is simple. It is a CLIENT-SIDE responsibility.

Although it may provide you with an alternative to deliver to the developers what they may want out of the DB
IMHO looping through an ordered result set in the client is by far more scalable and "faster" end-to-end from the User point of view.

Just my $0.02



I'll agree, which is why I had that in the disclaimer at the beginning of the article. What I was really looking at was a process to solve a problem. As I have found many times, prinicples are more important than details. The running total problem with partitioned tables offered a problem and hopefully this article provides a principle that can be applied to similiar problems that may not be a client-side responsibility.

An accounting professor once told me he rather teach 10 principles rather than 10 detailed tasks. With 10 principles you complete 100 tasks.

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Lynn Pettis
Lynn Pettis
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24283 Visits: 37991
I received a private message with a question regarding my article that I can't answer, so I thought I'd put it here with the hopes that some of the gurus (Jeff, Gail, Grant, Gus, et al) out there may be able to help out.


Why is the CPU time greater than the elapsed time

CPU time = 5063 ms, elapsed time = 3309 ms.


Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search