Temp DB filling on simple update statement?

  • Hi,

    I have a pretty simple update statement:

    UPDATE eligibility

    SET primary_plan = 0

    FROM temp_eligibility e

    WHERE eligibility.eligibility_id = e.eligibility_id

    This statement is updating around 7 million rows. Problem is it is filling up my data volume (TempDB). It is eating up more than 330 Gigs of space!. Guessing it is due to the large number if indexes on this table?

    Is there a way I can minimize the Tempdb growth? Change isolation level maybe?

  • See if this helps, since you don't need anything from the temp_eligibility table other than the row match, there's no point in a join

    UPDATE eligibility

    SET primary_plan = 0

    WHERE EXISTS (SELECT 1 FROM temp_eligibility e WHERE eligibility.eligibility_id = e.eligibility_id

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Can you post the full create table definition of the table, including all the indexes?

    Also, do you have any triggers on the table? If so post those up too.

    Are you absolutely certain this update is causing the tempdb issues?

    Oh, wait!! You are using TWO DIFFERENT TABLES - one that is UPDATEd and then other that is a driver table for what to update. So the likely culprit here is a Cartesian join. If this is the case then Gail's recommended fix should stop that.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks Gail.

    Trying this now.

  • krypto69 (5/25/2016)


    Hi,

    I have a pretty simple update statement:

    UPDATE eligibility

    SET primary_plan = 0

    FROM temp_eligibility e

    WHERE eligibility.eligibility_id = e.eligibility_id

    This statement is updating around 7 million rows. Problem is it is filling up my data volume (TempDB). It is eating up more than 330 Gigs of space!. Guessing it is due to the large number if indexes on this table?

    Is there a way I can minimize the Tempdb growth? Change isolation level maybe?

    I've run into this several times in the last couple of decades. That is actually an illegal form of UPDATE that, as you're finding out, will sometimes eat the face off of your machine. You won't find that particular form of joined UPDATE anywhere in books online.

    Any time you have a joined update, you must either do like Gail did with a correlated subquery or you must include the target of the update in the FROM clause with the correct join or you end up in a situation not unlike "Halloweening" (think "CROSSJOIN" but on steroids). SQL Server has "Halloween" protection built into it but only if the form of the UPDATE is correct and the form you have is NOT correct because you have neither a correlated subquery nor is the target table in the FROM clause.

    The insidious part of all this is that the improperly formed UPDATE will sometimes run correctly and then when you least expect it, some tipping point is reached and it goes nuts.

    The correct form of the UPDATE that you're trying to do should be as follows...

    UPDATE tgt

    SET primary_plan = 0

    FROM dbo.eligibility tgt

    JOIN dbo.temp_eligibility e

    WHERE tgt.eligibility_id = e.eligibility_id

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • online indexing enable for this table , if you use SORT_IN_TEMPDB online indexing it will eat up your tempdb database.

  • GilaMonster (5/25/2016)


    See if this helps, since you don't need anything from the temp_eligibility table other than the row match, there's no point in a join

    UPDATE eligibility

    SET primary_plan = 0

    WHERE EXISTS (SELECT 1 FROM temp_eligibility e WHERE eligibility.eligibility_id = e.eligibility_id

    Hi Gail,

    Can you please clear my doubt. When you use 'exists' condition, it will just return true even if there is one row found where the two tables match on eligibility_id column.

    But if we go with that and since there are matching rows, it will update all the rows in 'eligibility' table but the OP needs the column updated on rows which are matching.

    Please guide me, I got really stumped on this basic SQL.

  • chandan_jha18 (6/2/2016)since there are matching rows, it will update all the rows in 'eligibility' table but the OP needs the column updated on rows which are matching.

    Will it really?

    UPDATE eligibility

    SET primary_plan = 0

    WHERE EXISTS (SELECT 1 FROM temp_eligibility e WHERE eligibility.eligibility_id = e.eligibility_id

    Why don't you test it, confirm that it's wrong (if it is wrong) and produce a correct version?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Depending on how many indexes we're talking about, you could disable them pretty easily, do your update, then initiate a rebuild afterwards.

  • UPDATE eligibility SET primary_plan = 0

    WHERE primary_plan <> 0

    AND EXISTS (

    SELECT 1

    FROM temp_eligibility e

    WHERE e.eligibility_id = eligibility.eligibility_id)

    Might eliminate some writes.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • If it's filling TEMPDB during an UPDATE, then it's probably performing a HASH MATCH join between the tables. Look at the estimated execution plan to confirm. To aboid a hash join, consider indexing ELIGIBILITY_ID on both the tables.

    Still, I wouldn't exepect a join between two tables with 7 million rows to fill 300+ GB of temp space, so as suggested earier, also consider if ELIGIBILITY_ID alone is the correct join key. It's possible you're performing a partial cartesean product join. Again, you can confirm this by looking for a fat pipe within the execution plan and confirming the number of rows it's returning from the join operation.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • GilaMonster (6/2/2016)


    chandan_jha18 (6/2/2016)since there are matching rows, it will update all the rows in 'eligibility' table but the OP needs the column updated on rows which are matching.

    Will it really?

    UPDATE eligibility

    SET primary_plan = 0

    WHERE EXISTS (SELECT 1 FROM temp_eligibility e WHERE eligibility.eligibility_id = e.eligibility_id

    Why don't you test it, confirm that it's wrong (if it is wrong) and produce a correct version?

    Never doubted what you wrote, not in many light years:-) I will test it in morning. May be I am somewhat confused at this point and tired too.

  • krypto69 (5/25/2016)


    Hi,

    I have a pretty simple update statement:

    So is it fixed now???

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    Well it is fixed. Thanks everyone. I did run it with Gail's suggestion and it used about 30G worth of temp db space.

    So for us, that was a big win! Our runs in testing produced over 330G of space.

    The underlying issue turned out to be triggers. But Gails suggestion helped allot.

  • krypto69 (6/2/2016)


    Hi Jeff,

    Well it is fixed. Thanks everyone. I did run it with Gail's suggestion and it used about 30G worth of temp db space.

    So for us, that was a big win! Our runs in testing produced over 330G of space.

    The underlying issue turned out to be triggers. But Gails suggestion helped allot.

    What were the triggers doing? Also, if you're still using 30GB of TempDB, you still have a problem very well worth looking into.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply