Solving the “Running Totals” Problem in SQL Server 2005 with Partitioned Tables

  • Comments posted to this topic are about the item Solving the “Running Totals” Problem in SQL Server 2005 with Partitioned Tables

  • I think Jeff hinted that he was going to eventually get to the analytics solution and far be it from me to steal his thunder, but I couldn't help wondering if either I have misunderstood the scale of the problem or the solutions displayed thus far have been grossly convoluted. From my understanding of the problem at its simplest, we would like to compute a running total (leaving any notion of partitioning aside for the moment). Surely all that is needed is something like this:

    select AccountId,
    Date,
    Amount,
    sum(Amount) over (partition by AccountId order by Date) as "Running_Balance"
    from JBMTest
    order by AccountId, Date

    Now, partitioning. What has that got to do with anything? Again, from our simple perspective we don't care. Let our optimizer handle such details. As the query stands, we're looking to cover the entire table anyway. Given that our partitions are based on date, if we restrict to one year, we touch only one partition - but that is not something we need to explicitly state in a hint or some such construct, surely.

  • Not having access to a Developer or Enterprise edition of SQL Server, I find this topic hard to dig into myself; some of the SQL features used are only available in those editions, for those wondering why I bring it up. Still, I did quickly read through the essential parts that make up the solution and the path followed to get there, and I got a bit of a bad feeling after reading it.

    Searching for working methods is one thing, but what bugs me in this case is that the final output depends on query hints like OPTION (MAXDOP 1) that technically should not affect the output at all. The solution relies on SQL Server implementation details that are left unspecified by Microsoft and are likely to change over time.

    Even when implementation details do not change, maintainability is at risk because the technique breaks with the very useful concept of SQL itself. You ask a question and specify what information you want as output, and in what order; it is then up to the database to deliver according to your specification as best it can, isolating you from the details of how it does it.

    Imagine using the described technique, and four years from now you move to a four-processor, 12-core system with solid-state disks. Performance turns out to be unexpectedly low, and an expert is hired to see what is wrong. The first thing to get the boot would be the OPTION (MAXDOP 1): removing it would speed up the query significantly, and since such a hint should never affect output, nobody would notice right away that the results had changed.

    You see where I am going with this... it feels like playing dice with your data!

  • John Walker (1/27/2009)


    I think Jeff hinted that he was going to eventually get to the analytics solution and far be it from me to steal his thunder, but I couldn't help wondering if either I have misunderstood the scale of the problem or the solutions displayed thus far have been grossly convoluted. From my understanding of the problem at its simplest, we would like to compute a running total (leaving any notion of partitioning aside for the moment). Surely all that is needed is something like this:

    select AccountId,
    Date,
    Amount,
    sum(Amount) over (partition by AccountId order by Date) as "Running_Balance"
    from JBMTest
    order by AccountId, Date

    Now, partitioning. What has that got to do with anything? Again, from our simple perspective we don't care. Let our optimizer handle such details. As the query stands, we're looking to cover the entire table anyway. Given that our partitions are based on date, if we restrict to one year, we touch only one partition - but that is not something we need to explicitly state in a hint or some such construct, surely.

    Unfortunately, at this time the window functions in SQL Server are limited and won't give you the answer you need for the running total question. The row_number() function can be used in SQL Server 2005/2008 to handle the ranking problem, but that still leaves running balances (a rough sketch of the ranking part is at the end of this post). Until Microsoft fully implements the windowing functions in SQL Server, if you are looking to write performant code to meet this need, you will need to look at methods like this that utilize the database engine to its fullest.

    In addition, this is but an example that happens to use date for the partitioning. What if the partitioning were done on account ranges instead, because the application design merited it? Please remember, the purpose of the article was to demonstrate a technique that can be used to help solve a problem.
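
    For illustration, here is a rough sketch of the ranking part in SQL Server 2005, reusing the table and column names from the quoted post (only the ranking works this way; the running balance still needs a technique like the one in the article):

    select AccountId,
    Date,
    Amount,
    row_number() over (partition by AccountId order by Date) as "Txn_Seq"
    from JBMTest
    order by AccountId, Date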

  • peter (1/27/2009)


    Not having access to a Developer or Enterprise edition of SQL Server, I find this topic hard to dig into myself; some of the SQL features used are only available in those editions, for those wondering why I bring it up. Still, I did quickly read through the essential parts that make up the solution and the path followed to get there, and I got a bit of a bad feeling after reading it.

    Searching for working methods is one thing, but what bugs me in this case is that the final output depends on query hints like OPTION (MAXDOP 1) that technically should not affect the output at all. The solution relies on SQL Server implementation details that are left unspecified by Microsoft and are likely to change over time.

    Even when implementation details do not change, maintainability is at risk because the technique breaks with the very useful concept of SQL itself. You ask a question and specify what information you want as output, and in what order; it is then up to the database to deliver according to your specification as best it can, isolating you from the details of how it does it.

    Imagine using the described technique, and four years from now you move to a four-processor, 12-core system with solid-state disks. Performance turns out to be unexpectedly low, and an expert is hired to see what is wrong. The first thing to get the boot would be the OPTION (MAXDOP 1): removing it would speed up the query significantly, and since such a hint should never affect output, nobody would notice right away that the results had changed.

    You see where I am going with this... it feels like playing dice with your data!

    I understand your concern; however, part of it can be addressed by one simple thing: documentation. If this technique is used in a production system, the use of OPTION (MAXDOP 1) should be well commented so that someone coming along later would understand why it is there and would not remove it, thereby breaking the process (something along the lines of the sketch at the end of this post).

    I wish I had a true single-processor system (no hyper-threading or multiple cores) to test this on to see how it worked. Would I have thought that this worked without MAXDOP 1 in that environment? If it did, what would my reaction have been when I moved it to a multi-processor, multi-core environment and it suddenly failed (remember, I did a final test in such an environment)?

    Maybe someone out there has a true single-processor system and access to SQL Server 2005 Developer Edition, and will take the code from this article, run these tests, and let us know the results.
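
    For example, something along these lines; the UPDATE itself is only a hypothetical placeholder (the Running_Balance column is assumed), and the point is the warning comment travelling with the hint:

    declare @RunningBalance money
    set @RunningBalance = 0

    /* WARNING: OPTION (MAXDOP 1) below is required for CORRECTNESS, not performance.
       The running-total update depends on single-threaded, ordered processing;
       removing the hint can silently produce wrong balances. */
    update JBMTest
    set @RunningBalance = Running_Balance = @RunningBalance + Amount
    option (maxdop 1)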

  • I wish I had a true single-processor system (no hyper-threading or multiple cores) to test this on to see how it worked. Would I have thought that this worked without MAXDOP 1 in that environment? If it did, what would my reaction have been when I moved it to a multi-processor, multi-core environment and it suddenly failed (remember, I did a final test in such an environment)?

    Oh but you do! Go into your BIOS and turn that stuff off!:hehe:

  • Smendle (1/27/2009)


    I wish I had a true single-processor system (no hyper-threading or multiple cores) to test this on to see how it worked. Would I have thought that this worked without MAXDOP 1 in that environment? If it did, what would my reaction have been when I moved it to a multi-processor, multi-core environment and it suddenly failed (remember, I did a final test in such an environment)?

    Oh but you do! Go into your BIOS and turn that stuff off!:hehe:

    Maybe, but I'm a software geek, not a hardware geek. I'm not exactly sure how to go about doing that, and I'd be worried that I'd break something (and I'm not joking on that).

  • I liked your article (a lot), and it seems that you put a lot of effort into it.

    My view of running totals is simple: it is a CLIENT-SIDE responsibility.

    Although the technique may give you an alternative way to deliver to the developers what they want out of the DB, IMHO looping through an ordered result set in the client is by far more scalable and "faster" end-to-end from the user's point of view.

    Just my $0.02


    * Noel

  • noeld (1/27/2009)


    I liked your article (a lot), and it seems that you put a lot of effort into it.

    My view of running totals is simple: it is a CLIENT-SIDE responsibility.

    Although the technique may give you an alternative way to deliver to the developers what they want out of the DB, IMHO looping through an ordered result set in the client is by far more scalable and "faster" end-to-end from the user's point of view.

    Just my $0.02

    I'll agree, which is why I had that in the disclaimer at the beginning of the article. What I was really looking at was a process to solve a problem. As I have found many times, principles are more important than details. The running total problem with partitioned tables offered a problem, and hopefully this article provides a principle that can be applied to similar problems that may not be a client-side responsibility.

    An accounting professor once told me he would rather teach 10 principles than 10 detailed tasks. With 10 principles you can complete 100 tasks.

  • I received a private message with a question regarding my article that I can't answer, so I thought I'd put it here with the hopes that some of the gurus (Jeff, Gail, Grant, Gus, et al) out there may be able to help out.

    Why is the CPU time greater than the elapsed time?

    CPU time = 5063 ms, elapsed time = 3309 ms.

  • That can happen on multi-processor systems that SQL is allowed to use.
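
    (The CPU time reported is summed across every thread that worked on the query in parallel, so it can exceed the elapsed, wall-clock time.) A quick way to see it for yourself, assuming a table big enough to get a parallel plan, such as the article's JBMTest:

    set statistics time on
    -- parallel plan: CPU time is summed across all worker threads and can exceed elapsed time
    select count_big(*) from JBMTest
    -- serial plan: CPU time stays at or below elapsed time
    select count_big(*) from JBMTest option (maxdop 1)
    set statistics time off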

  • noeld (1/27/2009)


    I liked your article (a lot), and it seems that you put a lot of effort into it.

    And I'm forgetting my manners: thank you. It makes me feel good and makes the effort definitely worthwhile.

  • Apologies, I realise that my previous post represented something of a Utopian ideal 🙂 With that said, however, I do think that whilst making do and mending is *good*, striving for a simpler way is not to be dismissed. Yes, perhaps Microsoft will eventually get around to properly implementing such functionality.

    Whilst writing this, I can see that a middle-tier person just rode into town.

    6 lines of code vs. potentially 1,000,000 rows of data over the network.

    When does the next craft to Utopia depart?

  • Smendle (1/27/2009)


    That can happen on multi-processor systems that SQL is allowed to use.

    I am sure this is in response to the following:

    I received a private message with a question regarding my article that I can't answer, so I thought I'd put it here with the hopes that some of the gurus (Jeff, Gail, Grant, Gus, et al) out there may be able to help out.

    Why is the CPU time greater than the elapsed time?

    CPU time = 5063 ms, elapsed time = 3309 ms.

    Thinking about it, that actually makes sense and isn't something I'd have come up with on my own, thanks. Do any of the other gurus out there also agree?

  • Lynn Pettis (1/27/2009)


    Smendle (1/27/2009)


    That can happen on multi-processor systems that SQL is allowed to use.

    I am sure this is in response to the following:

    I received a private message with a question regarding my article that I can't answer, so I thought I'd put it here with the hopes that some of the gurus (Jeff, Gail, Grant, Gus, et al) out there may be able to help out.

    Why is the CPU time greater than the elapsed time?

    CPU time = 5063 ms, elapsed time = 3309 ms.

    Thinking about it, that actually makes sense and isn't something I'd have come up with on my own, thanks. Do any of the other gurus out there also agree?

    I concur. What happens if you add MAXDOP = 1 to that particular query?

    FWIW, in a previous life MAXDOP was used because execution plans differed wildly on the (single-CPU, pre-hyper-threading) Test kit from how they ran on the multi-CPU Live kit!

    Nice article btw 😉

    Dave Jackson


    http://glossopian.co.uk/
    "I don't know what I don't know."
