Need some Logic Help on some row by row process

  • psst - Jack? (secret time)

    Jeff's solution is referred by MS tech-heads (some of their bloggers) as an "inner loop" technique. Meaning - you have to touch each record one at a time a "accumulate" the values as you pass through them (yes - just like the variable would act in a "real" loop). The @var being used is a mechanism to "pass info" between consecutive records.

    And the "consecutive" part is where it gets tricky - and that's where the clustered index comes into play (since you need to control the order of the rows during processing - which is usually a big no-no in set-based processing).

    So far - it smells and acts like a "regular" loop with a cursor, right? The difference here seems to be that the row updates don't require individual commits (like a "classic" cursor would). The update for all intents and purposes performs at the same speed as a "regular" update. So - you manage to con SQL Server into giving you the ordering benefits of a cursor, while maintaining the performance of a "traditional" set-based solution.

    In other words - it's not exactly set-based, due to the ordering and looping. But it performs at the same level as something good "set-based", so - who really cares that it's technically not?

    "Just ignore the man behind the curtain...."

    (It actually in my mind shows the way a cursor SHOULD work. The "individual row update" part just kills me - there's no justification why the updates couldn't be kept in memory, and "flushed" all at once. Heck - even procedural languages allow for that, so why doesn't SQL Server?)

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Thanks for the explanation Matt. This is basically how I thought it would work, but trying to understand it sets my head spinning because it is so different from how a "normal" update would work.

    I just had an interesting thought, how would an update trigger handle the "inner loop", does it see it as a single update or individual updates?

  • Heh... ya beat me to it... thanks, Matt.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jack Corbett (3/4/2008)


    Thanks for the explanation Matt. This is basically how I thought it would work, but trying to understand it sets my head spinning because it is so different from how a "normal" update would work.

    I just had an interesting thought, how would an update trigger handle the "inner loop", does it see it as a single update or individual updates?

    It's a single update operation and is treated as such...

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • That would have been my assumption, but it never hurts to ask.

    Thanks,

  • Here's another approach that I use sometimes. Pack all the data you want into a fixed length string, aggregating on your desired ID then unpack if from the string.

    select

    Widget = convert(int, substring(AggData, 1, 11)) ,

    DateValue = convert(datetime, substring(AggData, 12,24))

    from (

    select AggData = max(convert(char(11), Widget) + convert(char(24), DateValue, 121))

    from MyTable

    group by Widget

    ) t

    I didn't actually test this sample since I don't have access to SQL Server at home, but it should work with minimal tweeking.

  • Jeff, Matt,

    that has just blown my socks off! Have you any links to further reading on this 'inner loop' technique.

  • Heh... yeah... I know a really good one 😉

    http://www.sqlservercentral.com/articles/Advanced+Querying/61716/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I just got into work to try it, and it looks like I didn't understand the question. Sorry about that.

  • OK, the same technigue of packing aggregate into a string, but using a correlated subquery.

    select

    Widget,

    DateValue

    from (

    select

    Widget,

    DateValue,

    prev=(

    select max(convert(varchar, DateValue, 121) + convert(char(11), Widget))

    from #mytable t2

    where t2.DateValue < t.DateValue

    )

    from #mytable t

    ) t

    where prev is null

    or Widget <> convert(int, substring(prev,24,11))

  • Heh... at least you tried. Thanks for the feedback.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ralph Curry (3/5/2008)


    OK, the same technigue of packing aggregate into a string, but using a correlated subquery.

    select

    Widget,

    DateValue

    from (

    select

    Widget,

    DateValue,

    prev=(

    select max(convert(varchar, DateValue, 121) + convert(char(11), Widget))

    from #mytable t2

    where t2.DateValue < t.DateValue

    )

    from #mytable t

    ) t

    where prev is null

    or Widget <> convert(int, substring(prev,24,11))

    Oh... be careful... that's got a triangular join in it... please see the following for why that can be so very bad...

    http://www.sqlservercentral.com/articles/T-SQL/61539/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks, that was a good read. I knew that the correlated subquery caused it to be a RBAR approach, just packaged in a single query in case you are querying a system you don't have DDL rights on. I also assumed that if there was an index on (DateValue, Widget) (maybe DESC) it would do the max in the correlated subquery more efficiently than the brute-force triangle join.

    While I have your attention, how about this:

    select

    identity(int, 1, 1) seq_id,

    Widget,

    DateValue

    into #myNewTable

    from #mytable

    order by DateValue, Widget

    -- create clustered index in (seq_id)

    select t.*

    from #myNewTable t

    left join #myNewTable t2

    on t.seq_id = t2.seq_id + 1

    where t.seq_id = 1 or t.Widget <> t2.Widget

  • Thats one of the better ideas, Ralph. It takes 25 ms duration, 15 ms CPU time, and 214 logical reads.

    The update method I showed, as long as it looks, takes 15 ms duration, 15 ms CPU time, 2 writes, and only 155 logical reads.

    Either way, though, they're both lightning fast compared to any kind of a loop or correlated subquery...

    Nice job.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff, I was finally able to take a look at this, and was able to modify what you provided to generate my results.

    Thanks for your help.

    Dan

Viewing 15 posts - 16 through 30 (of 30 total)

You must be logged in to reply to this topic. Login to reply