ShrinkFile on large database with TRUNCATEONLY

  • We have a large OLAP database, about 2.5 TB spread out over 3 data files on three different drives, and recently someone ran a query that created a table that continued to grow until the data files filled the available disk space (about 3 TB total - 1 TB per drive). Tonight I plan on running a full backup (it's in Simple mode) and running a ShrinkFile on all three files sequentially with TRUNCATEONLY just so it will remove the space after the last extent. Just to verify I'm not missing anything, does anyone see any issues with doing this on such a large database? Or is there any way to tell ahead of time how much space this will recover?
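
    For reference, here's exactly what I plan to run tonight (DATA1 through DATA3 stand in for the actual logical file names):

    DBCC SHRINKFILE (DATA1, TRUNCATEONLY);
    DBCC SHRINKFILE (DATA2, TRUNCATEONLY);
    DBCC SHRINKFILE (DATA3, TRUNCATEONLY);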

    Granted running a DB Shrink is one of those things you just don't do, but this is a one-time shot and unavoidable to get the file size back under control.

    Thanks.

  • If you use TRUNCATEONLY, SHRINKFILE will not attempt to move any pages around, so no fragmentation will be caused in this case.

    Therefore this is safe to do and worth a shot as a first attempt to see what you get back. However, in my experience the chances are you will not get anything back. I do not know of a query to determine if the last extent is in use.

    I would advise doing the shrink in chunks rather than trying to get all the space back in one go.
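
    As for estimating it ahead of time, you can at least see how much free space each data file is holding, which is an upper bound on what TRUNCATEONLY could ever give back. A rough sketch, run in the problem database (sizes come out in MB):

    -- Free space per data file: an upper bound on what TRUNCATEONLY could release
    SELECT  name,
            size / 128.0 AS file_size_mb,
            FILEPROPERTY(name, 'SpaceUsed') / 128.0 AS space_used_mb,
            (size - FILEPROPERTY(name, 'SpaceUsed')) / 128.0 AS free_mb
    FROM    sys.database_files
    WHERE   type_desc = 'ROWS';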

    ---------------------------------------------------------------------

  • george sibbald (12/30/2013)


    If you use TRUNCATEONLY, SHRINKFILE will not attempt to move any pages around, so no fragmentation will be caused in this case.

    Therefore this is safe to do and worth a shot as a first attempt to see what you get back. However, in my experience the chances are you will not get anything back. I do not know of a query to determine if the last extent is in use.

    I would advise doing the shrink in chunks rather than trying to get all the space back in one go.

    I've heard a few people suggest doing this in chunks, but how is that done? For example, one file is 950 GB with about 200 GB free, so would this be running something like this (assuming the TRUNCATEONLY pass gets nothing back):

    DBCC SHRINKFILE (DATA1,900000) then DBCC SHRINKFILE (DATA1,850000) then DBCC SHRINKFILE (DATA1,800000) and so forth until I get it to the desired size? I'll definitely be monitoring the log files while this runs, and I assume I can gauge an appropriate size to use based on when the logs start growing too large.

    Sam

  • Yes, that is how you shrink in chunks, though I would go for smaller amounts.

    Shrinks do not create much log, and as your DB is in SIMPLE mode the log will be truncated anyway.
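
    If you want to script it, a loop along these lines is one way to do it — the file name, target and step size below are just placeholders, and all sizes are in MB:

    -- Shrink DATA1 in roughly 10 GB steps instead of one huge shrink (all sizes in MB)
    DECLARE @current_mb int, @target_mb int = 750000, @step_mb int = 10000;

    SELECT @current_mb = size / 128
    FROM   sys.database_files
    WHERE  name = N'DATA1';

    WHILE @current_mb > @target_mb
    BEGIN
        SET @current_mb = CASE WHEN @current_mb - @step_mb > @target_mb
                               THEN @current_mb - @step_mb
                               ELSE @target_mb END;

        DBCC SHRINKFILE (DATA1, @current_mb);
    END;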

    ---------------------------------------------------------------------

  • samalex (12/30/2013)


    We have a large OLAP database, about 2.5 TB spread out over 3 data files on three different drives, and recently someone ran a query that created a table that continued to grow until the data files filled the available disk space (about 3 TB total - 1 TB per drive). Tonight I plan on running a full backup (it's in Simple mode) and running a ShrinkFile on all three files sequentially with TRUNCATEONLY just so it will remove the space after the last extent. Just to verify I'm not missing anything, does anyone see any issues with doing this on such a large database? Or is there any way to tell ahead of time how much space this will recover?

    Granted running a DB Shrink is one of those things you just don't do, but this is a one-time shot and unavoidable to get the file size back under control.

    Thanks.

    I agree with what Mr. Sibbald stated. Shrink with TRUNCATEONLY probably isn't going to do much for you. I also agree that it should be the first shot, because you just might get lucky, especially if you catch it early enough.

    How big is the largest table on this database?

    I ask because, as you well know, you need to repair the 99% fragmentation that DBCC SHRINKFILE will produce if you use anything other than the TRUNCATEONLY option. Running ALTER INDEX with the REORGANIZE option will "repack" the leaf level of the indexes, which may be good enough to get you out of the woods, and it has the advantage of "repacking" the pages "in place" without additional growth of the DB (if you do go the REBUILD route instead, the SORT_IN_TEMPDB option at least pushes the sort work out of the user database). It does have a bit of a price attached to it, though: it does NOT "repack" the B-Tree levels of any index, so there will still be serious fragmentation there. If your database is used "one row at a time", that might be OK, but I seriously doubt that to be the truth of any database.
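
    On a hypothetical big table, that pass is just this (dbo.BigTable and the index name are placeholders, not anything from your system):

    -- REORGANIZE repacks the leaf level in place, without the extra space a REBUILD needs
    ALTER INDEX ALL ON dbo.BigTable REORGANIZE;

    -- or, one index at a time
    ALTER INDEX IX_BigTable_SomeColumn ON dbo.BigTable REORGANIZE;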

    For non-Clustered Indexes, there are two different methods that I know of to keep the database from growing during the REBUILD... either DISABLE the index (which deallocates its pages without dropping the definition) and then re-enable it (which causes a REBUILD), or drop the index, wait a while for the pages to actually deallocate, and then re-create it. Of course, there will probably be issues with FKs.
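
    Both methods, sketched with the same placeholder names as above:

    -- Method 1: DISABLE deallocates the index pages but keeps the definition; rebuilding re-enables it
    ALTER INDEX IX_BigTable_SomeColumn ON dbo.BigTable DISABLE;
    ALTER INDEX IX_BigTable_SomeColumn ON dbo.BigTable REBUILD;

    -- Method 2: drop and re-create (script the definition out first)
    DROP INDEX IX_BigTable_SomeColumn ON dbo.BigTable;
    CREATE INDEX IX_BigTable_SomeColumn ON dbo.BigTable (SomeColumn);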

    For Clustered Indexes, unless the table is less than 128 extents (8 MB... really tiny in today's world), it keeps a copy of the old index until the new one is built and only then does it deallocate the pages occupied by the old index (and that includes HEAPs, as well). When REBUILDing a Clustered Index, it's going to allocate roughly the same number of new pages as there are in the old for all the data plus the B-Tree.

    For "Monster" tables, there is a workaround, although it's a bit complicated. REORGANIZE the indexes on the original table to "repack" the leaf/data level. Use something like sp_SpaceUsed to find out how much room the data plus indexes take up. Create a new FileGroup, and a new File within that group, sized just a bit larger than what sp_SpaceUsed reported, with normal growth settings that suit your expected/established growth patterns. Rebuild the Clustered index on the old table with an ON clause that points to the new FileGroup. This rebuilds the index, but the old pages are deallocated on the old FileGroup while the new pages form on the new FileGroup (effectively moving the data), instead of both sets of pages living in one FileGroup. Do this for each table (automate it if you have a lot of tables). You can then create the non-clustered indexes on the new FileGroup with almost no growth. Once the original FileGroup is empty, drop its file and the FileGroup itself to recover the space.
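
    Per table, the sequence looks roughly like this. Every database, FileGroup, path, size and index name below is a placeholder, and if the clustered index backs a PRIMARY KEY the key definition has to stay exactly the same for DROP_EXISTING to preserve the constraint:

    -- 1. New FileGroup and File, sized just a bit above what sp_SpaceUsed reported for the table
    ALTER DATABASE MyOlapDb ADD FILEGROUP FG_BigTable;
    ALTER DATABASE MyOlapDb ADD FILE
    (
        NAME = N'FG_BigTable_01',
        FILENAME = N'E:\Data\FG_BigTable_01.ndf',
        SIZE = 200GB,
        FILEGROWTH = 10GB
    ) TO FILEGROUP FG_BigTable;

    -- 2. Rebuild the clustered index onto the new FileGroup (this is what moves the data)
    CREATE UNIQUE CLUSTERED INDEX PK_BigTable
        ON dbo.BigTable (BigTableID)
        WITH (DROP_EXISTING = ON, SORT_IN_TEMPDB = ON)
        ON FG_BigTable;

    -- 3. Re-create the nonclustered indexes on the new FileGroup as well
    CREATE INDEX IX_BigTable_SomeColumn
        ON dbo.BigTable (SomeColumn)
        ON FG_BigTable;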

    Of course, that will initially take the same amount of room as REBUILDing the indexes in place, but you'll be able to recover that space afterwards.

    This isn't the first time I've seen a problem like this and I'm taking it to heart. For my larger systems, I'm in the process of moving many of my larger tables to their own FileGroups (and partitioning some of them, as well) and groups of smaller tables to other FileGroups. My intent is to leave the PRIMARY FileGroup almost empty, because mistakes like accidentally building a monster table with a runaway query will almost always happen on the PRIMARY FileGroup. If it doesn't have anything except the mistake in it, dropping the mistake table and shrinking the file will cost me virtually no work.

    As people unwittingly build new tables on the PRIMARY FileGroup, I'll make a sweep once a month and move those tables to other FileGroups as I've previously described.

    It's not a panacea but it seems like it will prevent most such problems in the future.

    For those people that say "Wait a minute... that disk space is dedicated to the server so why not just leave it alone", there might be two problems with that. The first one is that all of the space has been taken up by a single database which means none of the other databases can grow. The second issue is something I hope you never have to deal with and that's a DR restore on a smaller machine because everything else failed despite your best intentions. When such a thing happens, you want the smallest footprint possible. Large footprints can also make restores to test servers (which are usually smaller than production servers) virtually impossible.

    To coin a phrase, "Right Sizing Rules". 🙂

    Hopefully, Gail Shaw or one of the other MCMs will show up to make any additional suggestions or corrections to what I've said above.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I agree with what Mr. Sibbald stated.

    please, call me george.........

    ---------------------------------------------------------------------
