Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5

  • Sir Slicendice (2/19/2008)


    Well, you sure can age these: all line items for the 10 accounts that have the longest period without a transaction and that have a positive current balance:

    select rttl.*, accts.age, accts.[Current Total]

    from (select top(10) AccountID, datediff(month, max(Date), '2009-06-01') as age, sum(Amount) as [Current Total]

    from JBMTest

    group by AccountID

    order by AccountID) as accts

    cross apply slowcteTVF(accts.AccountID) as rttl

    where [Current Total] > 0.0

    and age > 1

    (The dates in my test data work best around 2009-06-01; luck of the random population process.....)

    Anyway, it's easy to add criteria and sub-queries, etc, and full performance is maintained. (This query is sub-1 sec on my machine.)

    Why store the running totals when it is fast to compute them on demand? What you don't store can't hurt you....

    -frank

    I have to report 6.5 million people to the credit bureau, so I've got to export this data. =)

    Also, maybe I messed it up, but when I adapted this for use on a really large table, my 80 million row table...

    This has been running for 29 minutes on the initial load...and doing a LOT of I/O and not returning me any results.

    Christopher Ford

  • Jeff Moden (2/19/2008)


    I have fooled around with some UDF replacements such as calculating

    date ranges, .... and I've been amazed how fast this works on CLR.

    I'd be real interested in knowing what kind of date ranges you're calculating... in fact, just for fun, I wouldn't mind trying to beat the CLR's you've made (someone else would have to run the tests, though). Not for bragging rights, either. Up to now, folks like Matt, myself, and others have been able to beat all but RegEx replacement hands down. And, considering the apparent problems folks have had with CLR's and 64 bit installations, it would be nice to offer high speed alternatives to such problems as calculated date ranges using only T-SQL.

    Lordy, we should write a book... corroboration like this is damned fun. 🙂

    No rocketscience, just started with KISS priciple :hehe:

    I have no doubt Matt may come up with even better functions for these, and that would highlight the ROI of forums.

    I've also done some testing for the geocoding thread on the SSC forum.

    and I use it to accentuate to my developers that a webservice may be

    offline, overloaded, slow, ...

    Well, actualy SQLCLR is the proof of the statement :

    "Curiosity killed the cat" and a cat has 9 lives 😀

    You will need those when you're experimenting with SQLCLR

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Christopher Ford (2/19/2008)


    BTW, on a production test.

    I had to run Jeff's running total update because I had 80 million rows to group by 1,928,672 different accounts over a 7 year period.

    1.5 minute run time...

    Now...THAT...is priceless.

    ::Edit -- The clustered index was already created in the order I needed the update run, so I didn't factor that into the run time. ::

    And, so is the feedback. Thanks for that, Chris! That's awesome...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • >>Why store the running totals when it is fast to compute them on demand? What you don't store can't hurt you....

    Yes, in fact, it can. We store precalculated data to avoid the overhead (however slight you may perceive it to be) of doing the work at run-time. I have seen systems brought to their knees by just this sort of thing, where a single point lookup or even small scan of a preaggregated value opens up all kinds of headroom on the server during heavy access periods.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Choosing whether or not to store some aggregate value (anything that can be calculated based on the fundamental data) depends on a lot of factors -- statutory requirements, how often the value is needed, impact of corrections/revisions, reporting/query use versus transaction use, opportunities for segregating/indexing, etc. Even the performance impact varies: wider tables may save calculations but may still be a net loss for performance depending on processor versus disk subsystem performance, etc....

    Lots of issues, so no clear reason to always store (or always compute) an aggregate like this....

    -frank


    The End.

  • Agreed on all counts Frank. There is almost never a one-size-fits-all rule or statement. Which is one of the primary reasons that experience is so important in this biz - the more things you have witnessed and learned from the more likely you are to notice a particular situation and know what is optimal for it.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Hey, anyone know if there is a way to extract out all of the 200+ posts for this thread into a single document? I would very much like to have this entire body of knowledge in a file indexed on my computer!!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (2/19/2008)


    Hey, anyone know if there is a way to extract out all of the 200+ posts for this thread into a single document? I would very much like to have this entire body of knowledge in a file indexed on my computer!!

    Here you go...

    --

    Jeff, there you go, you got your book, it's almost 50 pages. 😀

    Christopher Ford

  • Very cool, Chris... how the heck did you do it?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/19/2008)


    Very cool, Chris... how the heck did you do it?

    1. Click Topic Options in the blue title bar above top most post on page.

    2. Click Print Topic

    3. Have Adobe Acrobat Standard or Professional installed to print to PDF Printer.

    4. Post for your enjoyment.

    I was going to write a SQL CLR TVF to parse the HTML of the whole topic, then split the posts out into neat little entries and apply full text indexing to the table to make it searchable.

    Then export the rows into a new HTML document with the tags for searching into Microsoft Word....

    Then I found the Print Topic button and it smashed all my dreams for that project. :hehe:

    Christopher Ford

  • Awesome. And very simple, to boot! Thanks Chris! Neat redirection trick.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The TVF would run faster and be far more portable than relying on pushing the button; the button could move or be removed at any point..... But if you are doing a TVF, should definitely do it with the XML extensions in native SQL.

    -frank


    The End.

  • Sir Slicendice (2/19/2008)


    The TVF would run faster and be far more portable than relying on pushing the button; the button could move or be removed at any point..... But if you are doing a TVF, should definitely do it with the XML extensions in native SQL.

    -frank

    This is true...the button "could" be removed. Or it could be upgraded.

    Ofcourse, with the XML/Web Scrape...someone could upgrade the internet...and then that would be broken too.

    It's too bad that you can't get that silly RSS button to back fill the posts on the discussion thread.

    Christopher Ford

  • Jeff, Thanks for keeping me up too late again. As it seems I say often, I thought I was an expert then I came to SSC to find I was wrong.

    Oh and for printing to PDF you can also get PDF995 free an it prints to PDF as well. Sure there's an annoying add, but still not bad.

  • Heh... it's only fitting, Jack. I stayed up too late to write it 😛

    Thanks for both the feedback and the tip on the PDF stuff... I really appreciate it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 196 through 210 (of 250 total)

You must be logged in to reply to this topic. Login to reply