Adding a new data file to tempdb leads to incorrect data

  • One of the tempdb blog posts states that adding a new file to tempdb to improve performance can sometimes lead to incorrect data (in the context of data sorting or order). My thinking is that this could happen when multiple CPUs (multiple threads) handle that particular query (parallelism) and combine the results at the end, not in order. Am I on the right track?

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • That depends on what you mean by 'not in order'.

    Neither parallelism nor multiple tempdb files will cause a query with an ORDER BY to return data out of order. For any query without an ORDER BY, no order is ever guaranteed, so no order should ever be assumed, no matter what the server setup is.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • OK, thanks for the explanation. Can you please explain the phrase

    We Added a Tempdb Data File, and Data Was Wrong

    from http://www.brentozar.com/archive/2011/08/tempdb-multiplefiles-sort/

    -------Bhuvnesh----------

  • The data wasn't wrong. What was wrong was the assumption that the data would be ordered when no order was specified.

    That whole blog post could be rewritten into a fairly good demonstration of why it is wrong to assume an order when no ORDER BY is used. Framing it as "multiple tempdb files cause a problem" is pretty misleading, IMO.

    "a high severity production incident was caused by adding an additional data file to the tempdb database." -> "adding an additional data file to the tempdb database caused a false assumption in the application to become evident when it caused a high severity production incident."

    That's more like it 🙂

  • Bhuvnesh (12/20/2012)


    OK, thanks for the explanation. Can you please explain the phrase

    We Added a Tempdb Data File, and Data Was Wrong

    from http://www.brentozar.com/archive/2011/08/tempdb-multiplefiles-sort/

    Why don't you read the blog post? It was explained in there.

    Gail Shaw
  • The order of the data could have changed without adding a file to tempdb, so that problem could have started happening anytime.

    You should always have an ORDER BY unless you are absolutely sure that the order does not matter.

    It's very common to assume that because you see the data from a query returned in the order you wanted that it will continue to be returned in that order in the future.

    It's also a common but wrong assumption that when you select from a table, the data will be returned in the order of the clustered index.
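
To make the ORDER BY point above concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in engine, not SQL Server, and the `orders` table and its rows are hypothetical. It only illustrates the rule discussed in this thread: the one result order you may rely on is the one you ask for explicitly.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT)")

# Rows are deliberately inserted out of key order.
conn.executemany(
    "INSERT INTO orders (order_id, customer) VALUES (?, ?)",
    [(3, "Carol"), (1, "Alice"), (2, "Bob")],
)

# No ORDER BY: whatever order comes back is an implementation detail
# and must not be relied on.
unordered = conn.execute("SELECT order_id FROM orders").fetchall()

# Explicit ORDER BY: the only form whose order the engine promises.
ordered = conn.execute(
    "SELECT order_id FROM orders ORDER BY order_id"
).fetchall()
ordered_ids = [row[0] for row in ordered]
print(ordered_ids)  # [1, 2, 3]
```

The first query may well return rows in a "nice" order today, which is exactly how the false assumption gets baked into an application.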

  • Kendra's example was purely to show unintended consequences of making seemingly harmless changes to databases. Despite appearances, there is absolutely nothing that she's done that demonstrates that having multiple temp files affects the order in which data is imported. She did, however, do a great job demonstrating an accident just waiting to happen.

    To wit, the REAL problem with the code (which she hasn't explained because that wasn't the objective of the article) is the addition of an IDENTITY column after the table is populated. The order of the IDENTITY values that will appear in such an "after the fact" manner is absolutely NOT guaranteed to be in what people think is the correct order according to the data present.

    And, YES!!! I have the proof in the form of code! I'm in the process of writing it up.

    Another set of unintended lessons that I believe you need to learn from her fine post is to expect the unexpected, to know more about T-SQL before you decide what "good enough" really is, and to take the time to make your code as bulletproof as possible.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.
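
The point about after-the-fact IDENTITY values can be sketched as follows. This is not the proof code promised in the post above; it is a small illustration in Python's `sqlite3` module (an assumption standing in for SQL Server, and requiring a SQLite build with window-function support), with a hypothetical `staging` table. The idea is to derive the sequence number from the data through an explicit ORDER BY rather than numbering rows in whatever order they happen to be stored.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (event_time TEXT, payload TEXT)")

# Load order does not match the logical (event_time) order.
conn.executemany(
    "INSERT INTO staging (event_time, payload) VALUES (?, ?)",
    [
        ("2012-12-20 09:30", "second"),
        ("2012-12-20 09:45", "third"),
        ("2012-12-20 09:15", "first"),
    ],
)

# The sequence number comes from an explicit ordering of the data,
# not from the order in which rows were loaded or stored.
rows = conn.execute(
    """
    SELECT ROW_NUMBER() OVER (ORDER BY event_time) AS seq, payload
    FROM staging
    ORDER BY seq
    """
).fetchall()
print(rows)  # [(1, 'first'), (2, 'second'), (3, 'third')]
```

T-SQL also provides `ROW_NUMBER() OVER (ORDER BY ...)`, so the same pattern carries over; the table and column names here are illustrative only.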



  • I had read the article but wasn't able to extract the essence of it, which is why I posted here to get more and clearer insight.

    Jeff Moden (12/20/2012)


    ... I believe you need to learn from her fine post is to expect the unexpected, to know more about T-SQL before you decide what "good enough" really is, and to take the time to make your code as bulletproof as possible.

    Thanks, Jeff. I loved the quote you used here 🙂 "Expect the unexpected".

    -------Bhuvnesh----------
