Jeffrey Williams wrote:
If I read this correctly - it seems you are saying that index maintenance needs to be based on that system. And based upon the data access patterns and table structures - and how fast/frequently the table becomes fragmented to a point where it can then (potentially) cause performance issues.
Yes and no. What I'm saying is that if you're following currently accepted "Best Practices" of REORGANIZING when Logical Fragmentation is between 5% and 30% and REBUILDing for anything greater than 30%, you're very likely doing it all wrong. And it's not the "access patterns" that you need to be concerned with. It's the INSERT, UPDATE, and DELETE patterns (collectively "modification pattern" from here on) that you need to be concerned with because those are the only things that cause fragmentation. You also need to understand that there is both Logical Fragmentation and (what people call) Physical Fragmentation (which I simply refer to as "Page Density").
You also have to understand that the only tool that we currently use to detect either type of fragmentation is based only on averages and really provides very little insight as to what the actual modification pattern actually is. There are some types of indexes where you can make a really good guess. For example, if you have an index that has had no modifications at all (or a very tiny number) for, say, 4 weeks, there's a pretty good bet that it's either a static reference table, a rarely updated almost static reference table, or an unused table (you also need "no reads" for the latter to be true). Those can be rebuilt at 100% just to mark them as Static/Nearly static (and I DO use the Fill Factor to classify the modification pattern to control the index maintenance without the use of a separate permanent table).
I mark ever increasing indexes that have little or no fragmentation (<=1%) after 4 or more weeks that suffer modifications with a 99% Fill Factor so that I know what they are. They can still become logically fragmented over time simply because the underlying pages aren't aren't always going to be physically contiguous and I'll rebuild them if they go over 1%, which can take years to get to that point.
If similar ever increasing indexes suffer a lot of Logical Fragmentation or low Page Density, then "ExpAnsive" UPDATEs are the likely cause and I'll typically rebuild those with a 97% Fill Factor (the "7" looks like a "2" without the "foot", I'll explain that in a minute) so I recover the disk/memory space but know I have some work to do on them to prevent "ExpAnsive" UPDATEs in the future. Lowering the Fill Factor for these does absolutely nothing except to waste disk/memory space.
There are some indexes that always have a very high page density no matter how many rows you add but can fragment logically fragment to more than 20% literally overnight. If you do an analysis of the index keys, you'll probably find out that they're what I call a "Sequential Siloed" index. There's NOTHING you can to prevent the fragmentation on these because they like an "Ever-Increasing" index but they have multiple (sometimes hundreds) of insert points (not to be confused with "Evenly-Distributed" indexes, which can be fixed just by changing the Fill Factor). I rebuild these at 98% (the "8" laid on it's side looks like the infinity symbol for "always fragmenting forever") just so read aheads can work faster and I'll wait until they get to 10% to rebuild them. That sometimes means they get rebuilt every night if you do index maintenance every night.
Then there are indexes based on Random GUIDs and other very evenly distributed modification patterns. These are the ones that I'll lower the Fill Factor for. Depending on their modification rate, I'll set these to either 91%, 81%, or 71% and rebuild them when the go over 1% logical fragmentation. The "1" in the Fill Factor tells me they're "Evenly Distributed" so I know to rebuild them when they go over 1% logic fragmentation. And, no... this isn't the waste of memory or disk space you might think. We don't have the time nor space on this thread to prove it but (for example) an 81% Fill Factor will AVERAGE out at 87% memory/disk usage and have a max usage of nearly 100% (whatever the number of rows per page dictates) before they even start to fragment (which is why I rebuild them at 1% Logical Fragmentation).
Of course, "ExpAnsive" UPDATEs, out of order INSERTs (Random Silos, not GUIDs or Sequential Silos, etc) and DELETEs all play hell with all the above. The only way to fix them is to look at the index pattern of usage at the PAGE level. I've designed a tool that I call "IndexDNA™" that will actually let you see what the usage by page is in graphical form. I also created another tool that will auto-magically create an accumulator table and a trigger on the table you want to check for "Expansive Updates" and a stored procedure to help you do the analysis on the variable with columns that suffer expansion so you can decide how to fix the "ExpAnsive" UPDATEs.
So what do I do with those unknown indexes if I don't have the time to analyze them or fix them? I set their Fill Factors to 92% or 82%. The "2" in the Fill Factors is an indication to me that I don't know what the hell the index is doing but I wanted to recover the memory and disk space they're wasting while still giving them the opportunity to fill pages full without splitting (if that's what they're actually doing but don't know, at this point). Think of the "2" as "to do" because I eventually need to get to them especially if they go nuts on low page density a lot and they're big indexes (fat, long, or both).
I always do an IndexDNA™ on the really big indexes if they fragment a lot or suffer page density issues. For example, I have one Clustered Index that's now 200GB and fragments both logically and by low page density a lot even though it has an ever-increasing key. sys.dm_db_index_physical_stats and sys.dm_db_index_usage_stats would lead you to believe you have "ExpAnsive" UPDATEs working against you. That wasn't it at all, though. It turns out that there are NO "ExpAnsive" UPDATEs taking place on this index. The problem is that the people that designed the code that populates the table don't actually know how to do certain things and so they do a shedload of INSERTs and they follow that with a shedload of DELETEs. Without me figuring that out using IndexDNA™ and the "ExpAnsive" UPDATE analysis tools I built, I would have made the HUGE mistake of lowering the Fill Factor on this table. Instead and following my own rules, I rebuild it at 97% because they're never going to fix the damned thing. Since they only delete from the data they just inserted, I may be able to partition the table based on temporal keys so that I don't have to do such a large (>200GB) rebuild.
And, BTW, doing a REORGANIZE on this table when it was only 146GB caused my log file to explode from 20GB to 227GB even though the index was "only" 12% logically fragmented.
It'll be a cold day in hell (sometimes spelled as "Inrow LOBs", which I also have a fix for) before I use REORGANIZE ever again (it was also the leading cause of a lot of the major blocking I went through in Jan 2016 even on things that Fill Factors assigned).
Jeffrey Williams wrote:
If you have a table that uses a multi-column PK as the clustered index and generally reaches 20-30% fragmentation in under 7 days - then rebuilding that index once a week with an appropriate fill factor (not 100%) would be correct for that table?
Ok, so for you question above, what do you now think the answer is? If your answer isn't in the form of "it might be this or that but the real answer is "It Depends"", you'd might be wrong. 😀
Jeffrey Williams wrote:
On the other hand - if you have a table with an IDENTITY as the PK clustered index and that table is only ever inserted to (infrequent or no updates) then it doesn't need to be rebuilt? If batch inserts and parallelism comes into play - then rebuilding after the batch insert would be warranted?
Again, look back at what I wrote in the "rules" above and realize that, truly, "It Depends" and there is no panacea recommendation that anyone can make (especially since you can have multiple modification patterns on the same index from different sources). Like I said, I'll typically rebuild Static/Nearly Static tables at 100% and non-ExpAnsive but frequently inserted indexes at 99%. Both can go years without needing defragmentation (even if parallelism comes into play because inserts are "gathered" prior to the insert happening). If "ExpAnsive" UPDATES occur, I would have rebuilt the index at 97% and it could be a great idea to rebuild the index after large batch of UPDATEs but not after a large batch of INSERTs because it's ever-increasing and the end-of-index inserts will always fill the pages to as close to 100% as possible prior to creating a new page.
You also don't want to rebuild a 200GB Clustered Index just because such an event happened because you might be crippling everyone that's using the new data.
"Ah!", you say... "I'm going to use the ONLINE option because I have the Enterprise Edition". Heh... good luck with that. ONLINE rebuilds don't use as much log file as REORGANIZE does but, on a 146GB Clustered Index, you can expect ONLINE Rebuilds to use 146GB of log file as well as 146GB of data file space just like an OFFLINE Rebuild would. You're going to need to make a better plan like, partitioning the table or whatever.
The bottom line is that you "Must Look Eye" because, as is true with everything else in SQL Server, "It Depends" and "Nothing is a 100% panacea".