• telastruct (12/30/2010)


    Thanks, guys. I apologize if my "actually" seemed snitty- I see how it could be taken that way after re-reading it. I guess I should have followed it up with the question: If the replace method "can clean the 13 million rows in about 25 minutes" (about 2 minutes for each million rows), how is that faster than 2 minutes for about 4 million rows? I'm not a DB expert, and am sure there's probably some other factor I'm missing...?

    Even if it is not as efficient, I do like the way the other solution is packaged- it seems like the replace method would be pretty easy to incorporate.

    Thank you again for your thoughts.

    Understood and thanks for the feedback.

    The real key is the two different rowcounts on 2 different tables on 2 different machines by two different people using two different methods that haven't been fully posted. For example, are they doing an UPDATE, piping to a new table, or displaying on the screen? What's the status of fragmentation for both tables? How many indexes are being updated? Are the systems using the same version? Are the systems using the same capacity pipe and hard drives? What is the wait-state of the two systems? Are they both "dead quiet" or is one supporting a world wide heavy hit Web site with lots of reporting going on?

    There's something else that folks need to understand... each system has a "tipping point" where UPDATEs are concerned. For example... on one system it may take only 10 seconds to update a million rows and, likewise, only 20 seconds to update 2 million rows and only 30 seconds to update 3 million rows. But, on that same system with the same table and index structures using the exact same query, it may suddenly take 4 hours to update only 4 million rows. Everything else being the same, another system may be able to handle 6 million rows without reaching the "tipping point".

    If you want to do a comparison, it has to be on the same table and the same data or at least data with the same level of randomization. You just can't tell what's what with all the other variables currenlty in play.

    I'll also admit that I've seen certain memory only scalar functions beat other methods even when they have a WHILE loop in it. But, we won't actually know until at least one person actually tests both methods on the same data on the same machine. I was hoping you'd be that person so I didn't actually have to be "the one" yet again. 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)