Index fill factor

  • Hi Experts ,

    We have around 3000 indexes in our database . We have introduced a index maintenance plan to rebuild the indexes . Now while rebuilding the indexes we have option to specify the FillFactor  .

    My question is that , Is there a way to identify ideal FillFactor for all the 3000 indexes using some system tables or DMVs.

    Thanks in advance.

     

     

     

     

     

     

     

  • You say that you're "REBUILDing" indexes.  Do you really mean that or are you including REORGANIZE, which I refer to as "Death by defragmentation"?

    Obviously, I recommend NOT using REORGANIZE.  It's just NOT the tame little kitty that everyone thinks it is and, even though BOL says what it does, most people misinterpret what it does.  It doesn't actually do what most people think it will do and, in the end, it actually causes much more log file activity than a REBUILD and, in a lot of cases, actually sets up the index to perpetuate page splits, which is the leading cause (but not the only cause) of fragmentation.

    I don't know about anyone else but I need more information in order to help...

    1. Are you using any method (mirroring, log shipping, AG, etc) where it is mandatory that your databases always be in the FULL Recovery Model and you cannot make a temporary shift to the BULK LOGGED Recovery Model.
    2. Which edition (Standard or Enterprise) and which version of SQL Server are you using? (EDIT... looking a back at this post, I assume you're using 2014 but I still need to know the edition)
    3. What is the total size of the database in GigaBytes?
    4. What is the size of the top 10 indexes in GigaBytes?
    5. How much contiguous free disk space does your system have in GigaBytes by drive?
    6. How much time is available each week for "down time" where a given index can be offline?
    7. Are you using SQL Server 2016 SP2 CU11 or above (there's a new problem with that or anything above when using the ONLINE option for rebuilds that I've discovered over this past weekend). EDIT... looking a back at this post, I assume you're using 2014)
    8. When was the last time you did any index maintenance?
    9. When was the last time you did any statistics maintenance?
    10. When was the last time that either the server was totally rebooted or the last time that the SQL Server service was restarted?
    11. How often do you do either 9 or 10 above?
    12. Is the server "On-Premise" or in the cloud?  If in the cloud, what service are you using (AWS RDS, AWS EC2, Azure, or ????)
    13. Are any of the tables/indexes partitioned?
    14. What does your index maintenance plan currently consist of?  Please be specific.

    I know that sounds like a lot of information but a lot of people end up destroying themselves by assigning Fill Factors in a generic manner.  For example, a lot of people will say that if you have a table/index that fragments quickly, all you have to do is decrease the Fill Factor.  That can be a huge waste of memory and disk space especially if you have ever-increasing index keys and "ExpAnsive" updates, the latter of which queers the whole deal.  The same goes for ever-increasing "siloed" indexes, which cannot be made to fragment less no matter what you do to the Fill Factor.

    And if you're using the current "Best Practices" of reorg-ing between 5 and 30% fragmentation and rebuilding at 30% or greater, you're actually doing it wrong because, despite what the rest of the world has come to accept, they're not actually "Best Practices".  It was actually meant to be just a starting point for people that are mostly ignorant of indexes.  Having done the experiment over a period of 4 years, I can tell you it's actually better to not do any index maintenance and only rebuild statistics than it is to use what people have come to accept as "Best Practices".  And, no... I'm not recommending not ever doing index maintenance... I'm only recommending that it's better to do nothing but rebuild statistics than to do it wrong using current supposed "Best Practices".  You need to do index maintenance correctly if you're going to do it at all.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There isn't. In general, you might keep 100 for those that are mostly read only. For those that have lots of updates, you might try something in the 85-90 range. I wouldn't go below 80.

    I'm sure Jeff Moden will chime in here as he has done a lot of testing on fill factors and indexing.

  • Heh... there actually is a way. 😉  I've presented the method to just a couple of PASS chapters and I'm working on some proposed articles for a "Stairway".

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • There's no magic formula to tell you the best fillfactor for 3,000 indexes.  Or for 10 indexes, for that matter.

    Taking a step back for a broader look, the single most important thing from a performance standpoint is determining and implementing the best *clustered* index for that specific table and its usage.  Slapping an identity column on every table by default, with no consideration of that specific table, to act as the clustering key is a terrible practice, no matter what other experts might try to say.  That always-cluster-on-identity is the biggest myth in table creation (it's not really a "design" practice, per say, although many treat it as such).

    Data compression is the next biggest factor, assuming you have the usual I/O-bound and not the very rare CPU-bound work load.  Page compression can drastically reduce I/O.  But, note that it will then often take much longer to do a rebuild or possibly even reorg of the table.

    Skipping a genuine logical table design before creating the physical table is a huge damaging factor in table performance too.  It results in overly wide tables, which greatly contributes to all the performance problems tables end up having.

    Your main concern is next on the list (hooray!), since fillfactor is likely the next factor to consider:

    don't set the fillfactor too low by default;

    use every reasonable measure to can to avoid expanding varchar/varbinary columns, as these can force page splits no matter how high the freespace;

    if you must have more freespace in the active portion of your table, then, if you can, partition the table so you can use a (much) higher fillfactor on older data.

     

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • As to reorganization, I believe it's still best practice to use reorg for columnstore tables (I'm on SQL 2016).   Full rowgroups won't be affected at all by this process, but partial rowgroups will be combined into more efficient, larger rowgroups where possible.  This will require recompression, but for a limited number of rowgroups.

    Rebuild will (or at least used to) rebuild all rowgroups in the columnstore, and all data will be unavailable while being rebuilt.  That much compression could take a lot of time.

    I've often had need to reorg a row store table as well.  For example, the rare case where a massive table has had a lot of deletes, but it can't be done long enough to fully rebuild what's left.  A reorg can be useful then, although it can generate a rather large volume of log activity, it can still be much better than leaving the table longer with massive gaps in data.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • I really wish SQL would add an option to rebuild within a specified key range.  That would allow the benefits of a full rebuild without the downside of forcing it to be done on all rows in the index.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    There's no magic formula to tell you the best fillfactor for 3,000 indexes.  Or for 10 indexes, for that matter.

    Actually, for non partitioned rowstore indexes, there is (it takes a couple of formulas though).  You just don't know about it yet. 😀

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ScottPletcher wrote:

    As to reorganization, I believe it's still best practice to use reorg for columnstore tables (I'm on SQL 2016).   Full rowgroups won't be affected at all by this process, but partial rowgroups will be combined into more efficient, larger rowgroups where possible.  This will require recompression, but for a limited number of rowgroups.

    Rebuild will (or at least used to) rebuild all rowgroups in the columnstore, and all data will be unavailable while being rebuilt.  That much compression could take a lot of time.

    I've often had need to reorg a row store table as well.  For example, the rare case where a massive table has had a lot of deletes, but it can't be done long enough to fully rebuild what's left.  A reorg can be useful then, although it can generate a rather large volume of log activity, it can still be much better than leaving the table longer with massive gaps in data.

    I can't speak to REORGANIZE for column store indexes but, for rowstore indexes, you're spot on.  I have a 250GB clustered index where they do deletes slightly after they do a wad of updates (poor design of code there).  It has a maximum fill of 93% page density (the rows are wide enough to only allow 9 rows per page and the pages don't fill more than that, in this case).  Because of the deletes, it was wasting about 30 GB and the page density on the most active spot on the clustered index was tanked at about 15% page density.  The average logical fragmentation was only a little over 4% but I did a REORGANIZE on it in a copy of the database... it caused the log file to grow well over 400 GB (it was a test so I had log file backups turned off as a simple way to measure how much log file it actually used) and took about 2.5 hours to execute and it didn't even fix all of the low page density pages.

    What I really wish is that they'd fix REORGANIZE. 😀

    BTW, since the database isn't being replicated, I slipped it into the BULK LOGGED recovery model and rebuilt it (in another test on another copy) and it only too a little over 20 minutes and the log file only grew to about 3GB.  The test logfile backup after that was 225 GB, though (don't know where it came from because the log file wasn't that large).  A full backup took over an hour and grew the logfile to a little over 250GB, which is still a whole lot less than more than the 400GB REORGANIZE produced.

    I also tried an ONLINE REBUILD... the log file was about the same as a normal rebuild and but took nearly twice as long.  It also turns out that there appears to be a problem they built in to 20016 SP2 CU11 when they made "fix" for something having to do with ONLINE REBUILDs because the damned thing totally inverted the index with more than 97% fragmentation and segment sizes of 1.02 average pages.  I'm doing some more research on that before I submit a feedback problem.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Many of my dbs are in simple mode since, for those for the most part, we can reload data from the past 12 hours rather easily.  So I can have a diff backup done twice a day rather than having to do full mode.  That keeps from having any log issues, and typically makes reorg vs rebuild an even bigger savings.

     

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • MS has actually provided complex formulas for years.  But it's so difficult to come up with accurate numbers to go into the formulas, and the result is only as good as the estimates that go into it, much like for the optimizer.

    It's not really even that I guess you can't come up with some type of formula.  It's that people focus first on the reorg/rebuild question when they've ignored far more important considerations -- including best *clustered* index, data compression (or encoding, if compression isn't available), and doing proper designs of tables before creating them.  Only then should you worry about other index issues.

    I'm still a bit puzzled as to why my 30 years of experience is effectively meaningless, as you've relayed to me before, whereas your 4 years of experience is absolutely definitive.  Of course, I'm looking forward to reading your article(s) when they do come out.

    • This reply was modified 3 years, 10 months ago by  ScottPletcher.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher wrote:

    MS has actually provided complex formulas for years.  But it's so difficult to come up with accurate numbers to go into the formulas, and the result is only as good as the estimates that go into it, much like for the optimizer.

    It's not really even that I guess you can't come up with some type of formula.  It's that people focus first on the reorg/rebuild question when they've ignored far more important considerations -- including best *clustered* index, data compression (or encoding, if compression isn't available), and doing proper designs of tables before creating them.  Only then should you worry about other index issues.

    I'm still a bit puzzled as to why my 30 years of experience is effectively meaningless, as you've relayed to me before, whereas your 4 years of experience is absolutely definitive.  Of course, I'm looking forward to reading your article(s) when they do come out.

    To be sure, Scott... I've not dismissed your experience any more than you seem to be dismissing mine.  I have almost the same amount of experience as you and not just the 4 years you state.  I started working with SQL Server when 6.5 was the latest and greatest.  I also agree that selecting the correct keys, especially for Clustered Indexes, is paramount.  I just happen to frequently disagree with you on what those should be. 😉

    You state that "MS has actually provided complex formulas for years. " ... do you have any links to such recommendations?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    Heh... there actually is a way. 😉  I've presented the method to just a couple of PASS chapters and I'm working on some proposed articles for a "Stairway".

    Now, this is a stairway I can't wait to read. Thanks for working on that Jeff.

  • Tom Uellner wrote:

    Jeff Moden wrote:

    Heh... there actually is a way. 😉  I've presented the method to just a couple of PASS chapters and I'm working on some proposed articles for a "Stairway".

    Now, this is a stairway I can't wait to read. Thanks for working on that Jeff.

    Thank you for the encouragement!

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply