Space Required For Index Rebuilds

  • While working with multiple FileGroup/File combos for table partitioning and setting up for PieceMeal restores, I ran into the following problem as quoted from the MS article at http://technet.microsoft.com/en-us/library/ms179542.aspx. The embolding is mine.

    When a new index structure is created, disk space for both the old (source) and new (target) structures [font="Arial Black"]is required in their appropriate files and filegroups[/font]. The old structure is not deallocated until the index creation transaction commits.

    This problem will easily cause the given File to double in size even if SORT IN TEMPDB = ON. Of course, when setting a File for something like an Audit table to READ ONLY and you have 1 of these files for each of more than four years, the amount of wasted disk space in each of those files really starts to add up. Essentially, you're doubling the disk space required for a huge table.

    Now, I've figured out a way to keep that from happening each month when I go to set the previous month's file to READ ONLY (it would be a problem even if I didn't set it to READ ONLY, BTW), but it's complex, offline, and reasonably slow because it requires that you copy the data out of the file to another FileGroup/File, SWITCH the original data to another table on the same File/FileGroup, dropping the SWITCH table, shrinking the bloated file I just moved the data out of to 0, calculate the size of the data and indexes of the copied data, resize the previously bloated file using that number so as to avoid fragmentation caused by miniscule growth factors, and then reapply the partitioned CI to the moved data to overcome the problem of the original index staying active on the file group until the new index is created.

    [font="Arial Black"]My question is[/font], does anyone have a trick up their sleeve that would make rebuilding a clustered index on a single file partition any easier and without the doubling in size due to the delayed deallocation problem cited at the URL I posted at the beginning of this post? Or am I pretty much stuck with the relatively complex method I've developed for this?

    As a bit of a sidebar on this subject, I'd love to ask MS "What were you thinking?" Peter Norton got this type of stuff down pat literally decades ago.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Not the solution, but the reason.

    When rebuilding an index, the old index is only dropped once the new one is complete. It's for ease of rollback (can just drop the part-built index structure) and performance (SQL reads the old index to create the new one, alleviating the need to do a sort or read more data than necessary or re-apply any filter in the case of a filtered index), also for online rebuilds to allow queries to read and use the index during the rebuild.

    This isn't the sort space, which is what goes into TempDB when sort in TempDB is on.

    You could try rebuild twice then shrink with truncate_only. Otherwise REORGANIZE which doesn't recreate the index and hence has no such space requirements

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Update... I may have answered my own question but I haven't tried it, yet. I had previously rejected the idea of using just DBCC SHRINKFILE because the TRUNCATEONLY option doesn't seem to work well on the files of the very active PRIMARY file group and, of course, not using the TRUNCATEONLY option causes massive fragmentation. Since the files that I'd be rebuilding the Clustered Index on would not have any modifications or new inserts in progress, it might just work if I apply it immediately after each index rebuild on the small partition file.

    Still, if anyone has a better idea, I'd love to hear it. It would be, as my boss says, a "spec-hackular" idea.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh... great minds think alike, Gail. You posted while I was writing. Thanks for the confirmation on the TRUNCATEONLY thing. I'll give it a try.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (12/8/2013)


    Since the files that I'd be rebuilding the Clustered Index on would not have any modifications or new inserts in progress, it might just work if I apply it immediately after each index rebuild on the small partition file.

    It won't, because the new index will be at the end of the file and the empty space at the beginning. Hence why I suggested rebuild *twice*, then the second rebuild should, hopefully reuse the space that the index was initially in, leaving free space at the end of the file which can be truncated.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ah. Got it. Thanks, Gail.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply