After 2014 to 2017 Upgrade, table unused space growing massively.

  • I have a database that we recently upgraded from SQL 2014 to SQL 2017. The application against the data are working fine, but we started to notice database file growth.

    After investigating, many tables are exhibiting a behavior where the unused_space is growing massively compared to the time before the upgrade, and I cannot figure out why.

    Many of the tables receive millions of rows per day. One example, call it our A_S_A table, has a row count of over 350 Million rows, and before the 2017 upgrade, the daily "unused space" value was about 9.7 GB. It is currently, 6 days later, 55,293 GB and growing daily.

    I don't understand what's going on and what's different.

  • Fillfactor, collation, some other change?

  • is it a heap? (does it have a clustered index)

    if you process things nightly and delete data then the space can remain allocated but the data content is a lot less than the allocated space

    if you can't add a clustered index then you could try using DBCC CLEANTABLE to reclaim lost space

    MVDBA

  • MVDBA (Mike Vessey) wrote:

    is it a heap? (does it have a clustered index)

    if you process things nightly and delete data then the space can remain allocated but the data content is a lot less than the allocated space

    if you can't add a clustered index then you could try using DBCC CLEANTABLE to reclaim lost space

    ALTER TABLE ... REBUILD for a HEAP to recover the space.  If this is a full refresh operation - that is, delete all rows and reload - then use TRUNCATE instead of DELETE.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Not heaps. They do have clustered indexes as primary keys.

    Most are BULK inserted to.

    Rebuilding the indexes does help to recapture the space, but it only starts building again, which was not happening prior to the 2017 upgrade, so it's kind of like treating a cold with a tissue - keeps the nose clean, but the cold is still there. 🙂

    One of my google searches today stumbled across TRACE FLAG 692 (https://docs.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-traceon-trace-flags-transact-sql?view=sql-server-2017).

    Disables fast inserts while bulk loading data into heap or clustered index. Starting SQL Server 2016 (13.x), fast inserts is enabled by default leveraging minimal logging when database is in simple or bulk logged recovery model to optimize insert performance for records inserted into new pages. With fast inserts, each bulk load batch acquires new extent(s) bypassing the allocation lookup for existing extent with available free space to optimize insert performance.

    With fast inserts, bulk loads with small batch sizes can lead to increased unused space consumed by objects hence it is recommended to use large batchsize for each batch to fill the extent completely. If increasing batchsize is not feasible, this trace flag can help reduce unused space reserved at the expense of performance.

     

    This seems to have helped. I will report back once I am fully convinced.

     

  • Thanks for the update and please do post back. I'm sure I'm not the only one curious about this issue so any updates would be great.

    Sue

  • write it up as an article for the site, it's quite an interesting find. I'm sure steve would love it

    MVDBA

  • Yes, he would.\

  • jimkern wrote:

    Not heaps. They do have clustered indexes as primary keys.

    Most are BULK inserted to.

    Rebuilding the indexes does help to recapture the space, but it only starts building again, which was not happening prior to the 2017 upgrade, so it's kind of like treating a cold with a tissue - keeps the nose clean, but the cold is still there. 🙂

    One of my google searches today stumbled across TRACE FLAG 692 (https://docs.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-traceon-trace-flags-transact-sql?view=sql-server-2017).

    Disables fast inserts while bulk loading data into heap or clustered index. Starting SQL Server 2016 (13.x), fast inserts is enabled by default leveraging minimal logging when database is in simple or bulk logged recovery model to optimize insert performance for records inserted into new pages. With fast inserts, each bulk load batch acquires new extent(s) bypassing the allocation lookup for existing extent with available free space to optimize insert performance.

    With fast inserts, bulk loads with small batch sizes can lead to increased unused space consumed by objects hence it is recommended to use large batchsize for each batch to fill the extent completely. If increasing batchsize is not feasible, this trace flag can help reduce unused space reserved at the expense of performance.

    This seems to have helped. I will report back once I am fully convinced.

    Lordy... every time they make an improvement, there's some hidden cost.  Them making TF1117 the non-modifiable default for TempDB recently killed me during a major table upgrade because IDENTITY INSERT always causes a sort in TempDB.  Imagine 8 files all growing to 146GB just because one did.  In the past, the files never grew past 2GB each and we only have 100GB allocated to the drive.  I was able to pull off what we needed to do on an alternate machine that only has a single file without the same 100GB cap.

    The other thing that they don't come right out and say in the article that you linked to is, if you enable the trace flag globally, whether or not it will screw up old-style BULK INSERTs (and similar INSERT/SELECT/TABLOCK code) that do meet all of the other requirements for minimal logging.

    So, to re-coin the marketing phrase "It just runs faster... but it will cost you dearly in ways you could never imagine".

    They're making similar "improvements" in SSMS... 18.2 automatically displays the completion time for code runs and that "feature" cannot be disabled.  It would take too long to explain why but I now have to make a work-around for that.  That's not to mention the fact that they removed "Database Diagrams" (which they, very fortunately, put back) and they removed the "Debugger" feature, which they're not yet put back despite and incredible public outcry.

    You have to wonder where the hell they're getting such ideas from.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • At this point, nearly 2 weeks after upgrading from SQL-2014 to SQL-2017, and subsequently 5 days after enabling trace flag 692, I believe I can say that the trace flag has had a very positive impact on the unused space issue.

    As our daily partitioned tables begin to roll through our weekly retention policy, some of the "larger" partitions are traded out for some of the smaller ones now being created with the trace flag, and we are recovering unused space to the database as a whole instead of accumulating it into reserved space for these tables. We have recovered almost 400GB of space that way in 5 days.

    So, trying to gather some information as to what contributed to this in our application, since we cannot be the first database ever to upgrade SQL 2017, right? 😉

    The database that was growing rapidly was a "history" database. Our operational database was not hit nearly as hard with this issue. Just the archival database. So what was the difference? Our operational database is just that...an OLTP database run transactionally, one record at a time, updated, inserted, etc. The History database, however, has many tables that are batch-inserted to for archival purposes.

    Seems that would be a key factor in the impact of this change in behavior.

    In fact, the table most impacted seems to be a table that was batch-inserted to with a SINGLE RECORD size batch. Seems the smaller the batch size, the more the waste.

    After the changes (SQL version and TF692), a few of the tables now have MUCH smaller unused space now. 1/20th of what they were under SQL-2014. A few tables are still 3x what they were before under SQL-2014. But none of them near the size that they were under 2017 without the trace flag. One example, the unused space for one table went from ~10000 KB to a max of ~86679000 KB and settled back to about 28000 KB.  Another example table from ~50000 KB to a max of ~9345808 KB and has settled back to about 4500 KB of unused space now. Have not dug into the differences between those tables yet.

    I'm not a skilled article writer, but would love to summarize this some way, with some of the actual data that I have pre/during/post. Any tips on how to organize this to an article, as suggested?

  • Thanks for the detailed followup.  Just to be sure, though... the problem/observations that I posted appear to be the same problem and it's with 2016, not 2017.  Did I "get lucky"?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have a suspicion that the same behavior would have been seen in SQL2016, but no proof and no reason to personally research it at the moment. BTW...we are considering implementing TF1117 and TF1118 soon as well, or perhaps using the per-database option instead, but I want to proceed with caution.

    Jim

  • jimkern wrote:

    I have a suspicion that the same behavior would have been seen in SQL2016, but no proof and no reason to personally research it at the moment. BTW...we are considering implementing TF1117 and TF1118 soon as well, or perhaps using the per-database option instead, but I want to proceed with caution.

    Jim

    TF 1118 and 1117 are automatically implemented in 2016.  You can turn them off for all databases except for TempDB.  I wouldn't turn 1118 off anywhere but I would like to disable 1117 on TempDB.  Unfortunately, you can't.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    TF 1118 and 1117 are automatically implemented in 2016.  You can turn them off for all databases except for TempDB.  I wouldn't turn 1118 off anywhere but I would like to disable 1117 on TempDB.  Unfortunately, you can't.

    I think people need to read and fully understand the changes in 2016 especially for the new extensions
    TF 1118 OFF = Database setting MIXED_PAGE_ALLOCATION ON
    TF 1118 ON = Database setting MIXED_PAGE_ALLOCATION OFF (DEFAULT)
    depending which you use or don't use, it is very confusing.
    Especially if they try use the extension assuming ON/OFF mean the same thing as the trace flag.
    I've never set TF 1118 ON and all my databases are set to the default MIXED_PAGE_ALLOCATION OFF

    except for master, model and msdb.

    Far away is close at hand in the images of elsewhere.
    Anon.

  • David Burrows wrote:

    Jeff Moden wrote:

    TF 1118 and 1117 are automatically implemented in 2016.  You can turn them off for all databases except for TempDB.  I wouldn't turn 1118 off anywhere but I would like to disable 1117 on TempDB.  Unfortunately, you can't.

     

    I think people need to read and fully understand the changes in 2016 especially for the new extensions
    TF 1118 OFF = Database setting MIXED_PAGE_ALLOCATION ON
    TF 1118 ON = Database setting MIXED_PAGE_ALLOCATION OFF (DEFAULT)
    depending which you use or don't use, it is very confusing.
    Especially if they try use the extension assuming ON/OFF mean the same thing as the trace flag.
    I've never set TF 1118 ON and all my databases are set to the default MIXED_PAGE_ALLOCATION OFF

    except for master, model and msdb.

    Correct and my bad for referring to TF 1117 and 1118 for SQL Server 2016.  I do mean MIXED_PAGE_ALLOCATION OFF when I say that TF 1118 is automatically enabled and I should get out of that habit.  To be sure, if you do include either TF 1117 or TF 1118 in the startup for SQL Server, you'll find an error in the error logs stating that the trace flags are no longer supported and are ignored in the startup.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply