Index Fragmentation?

  • I'll try to be as concise as possible and lead you down the path i've traveled :).

    I'm in the process of investigating some recent perf problems we have had with SQL Server 2008 R2. One of the things i've looked at is FillFactor. We have a standard of 80 on all indexes which according to what i've been reading is a bad idea. From what i've read you should be looking at index fragmentation and checking how quickly indexes are being fragmented to determine fillfactor and gradually lower it until fragmentation of indexes is within an acceptable parameter.

    So the that got me looking into index fragmentation of our indexes. What i discovered astonished me. Our fragmentation was horrible. A lot of them being 95-99%.

    I talked to our dba about it and he said they have ran index rebuilds or defrags for about 8 years. We used to do it on Sunday when we expected our traffic would be low and found out they were taking too long (hours). This was before online index rebuilds. It was determined at the time to just cancel index rebuilds.

    I then ran into Brent Ozars article on "Why index fragmentation doesn't matter"

    and i want to make sure im interpreting what he is saying correctly. So us never rebuilding/defragging indexes is ok? Does index fragmentation affect query plans? Could it cause some bad query plans that otherwise would be ok? If your supposed to use index fragmentation as a gauge of what to use for fill factor how can this be done if you never rebuild indexes?

  • I wouldn't say that.

    Fragmentation is not that massive of a concern, but rebuilding indexes also removes the half-full pages that you tend to get as a result of page splits. It also does stats updates.

    If you don't want to rebuild, OK (though I don't entirely agree with Brent about that), but then you probably should do manual statistics maintenance as the default 20% auto-update is not good enough in most cases.

    Fragmentation does not affect query plans as the optimiser has no knowledge of fragmentation. Bad stats on the other hand, huge effects.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (7/9/2014)


    I wouldn't say that.

    Fragmentation is not that massive of a concern, but rebuilding indexes also removes the half-full pages that you tend to get as a result of page splits. It also does stats updates.

    If you don't want to rebuild, OK (though I don't entirely agree with Brent about that), but then you probably should do manual statistics maintenance as the default 20% auto-update is not good enough in most cases.

    Fragmentation does not affect query plans as the optimiser has no knowledge of fragmentation. Bad stats on the other hand, huge effects.

    Thanks for the reply. Ya we do an sp_updatestats every night. How about trying to determine fill factor without using indexes fragmentation statistics. Is it possible or just a guessing game at that point? One thing i know i can change is fillfactor for our primary keys (int). Those are set at 80% also. Those of course can safely be set to 100%. Do you agree that a standard of 80 for all indexes is generally a bad idea?

  • I'm pretty mixed on index fragmentation. Unfortunately, I've seen it cause problems, so, if I can defragment indexes for little or no cost, I'm going to do it. If the cost is too high, then the evaluation process starts.

    The biggest question I'd have is what is the behavior of your queries like? If you've got pretty precise OLTP style queries that are doing point lookups with very few range seeks and little to no scans, fragmentation is unlikely to hurt you at all. If you're getting a goodly number of scans or lots of scans, fragmentation could be causing a great deal of pain. Before I made the choice to toss defragmentation completely or even cut back on it, I'd suggest gaining that understanding first. A quick and dirty, and not very accurate way to do this, is to look at the sys.dm_db_index_usage_stats Dynamic Management View (DMV). It will at least give you indications on scans versus seeks on your indexes. For accurate measures of this, you'd need to look at query plans on particular tables & indexes.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Ncage (7/9/2014)


    Ya we do an sp_updatestats every night

    So a sampled update. That may be fine, or it may be terrible. Can't tell. Take a look at your poorly performing queries and make sure you don't have cardinality problems.

    Personally I still schedule index rebuilds on all servers. But I schedule targeted ones that rebuild just what needs rebuilding and only larger tables.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (7/9/2014)


    Personally I still schedule index rebuilds on all servers. But I schedule targeted ones that rebuild just what needs rebuilding and only larger tables.

    +1

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • The single biggest factor in overall query performance is getting the best clustered index on each table (hint: it's rarely an identity column). Typically you'd start with a higher number than 80%, more like 90-95%, unless you do a large percentage of random INSERTs and/or you frequently UPDATE varchar column(s) and lengthen them (in which case you might be far better 'pre-padding' the columns to avoid too many column expansions that can cause page splits).

    If possible, you want the first (and often only) clustering key column to be naturally ascending, such as a calendar date time or an identity column (but warning: as implied above, identities are vastly over-rated as a clustering key, and on average are best for only about 15-25% of tables).

    For tuning, you need to look at the index usage and missing index stats provided by SQL Server. If you want assistance with that, let me know.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • ScottPletcher (7/9/2014)


    ...identities are vastly over-rated as a clustering key, and on average are best for only about 15-25% of tables

    To that I'll say [font="Arial Black"]"IT DEPENDS"[/font].

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Grant Fritchey (7/9/2014)


    GilaMonster (7/9/2014)


    Personally I still schedule index rebuilds on all servers. But I schedule targeted ones that rebuild just what needs rebuilding and only larger tables.

    +1

    I mostly agree with that. I schedule index maintenance on all of my servers. I opt for custom rebuild jobs as well. But I have seen significant impact by fragg'd indexes on smallish tables (as few as 10 pages). I just had a client experiencing a production outage due to an index at 99% fragmentation, 250 pages (I know it is a lot different than 10 pages but still smallish). Defrag brought their app out of the weeds. Tried a stats update prior to that and it helped marginally, but less than the subsequent defrag.

    I think the key is to understand the data, the use and be prepared to still defrag. 😉

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply