Standards Matter

  • Comments posted to this topic are about the item Standards Matter

  • can you please share more of the 'Ola Scripts' that you mention? I'm having a hard time digesting the who, what, when where and why of those scripts, and I want to get to know them better.

    I just HAD to ask.

  • I liked the article and made me think of the acronym of KISS (Keep It Simple and Standard).

  • One of the problems with Ola's utilities is that it relies on arbitrary limits to determine when to use REORG vs REBUILD.  In fact, you are probably going to hear from several that you should never use reorg - at all.  There are a few valid reasons to use reorg - but consider those as edge-cases only (for example LOB COMPACTION is one edge-case).

    So that brings up the question of how large a transaction log you need.  Well - that is actually quite simple - if you need a 1TB drive to support normal daily/weekly/monthly operations then that is the size you need.  There is nothing inherently wrong with having a large transaction log and if grown out appropriately so that you don't have too many or too few VLF's then it won't even affect recovery operations.

    There are other techniques available to help reduce transaction log usage - for example, switching to bulk-logged recovery when performing index rebuilds can reduce log usage, but at the expense of being able to recover to a point in time during that period.  You also have the option of rebuilding ONLINE vs OFFLINE - and if you do have that option then using ONLINE instead of REORG is a much better option.  If you have a maintenance window where you can take the table offline - then using OFFLINE rebuilds are often much faster than ONLINE and also utilize less transaction log space.

    As to standards - absolutely agree.  One item you sort of touched on was file placement - you should have a standard for all installations for your storage layout.  I personally like to see a minimum of 3 drives:

    • System DB drive (hosts the system databases and files only)
    • User DB drive (hosts user databases)
    • Backups

    For smaller systems the above is a good start.  Generally systems with an overall size less than 100GB where usage is also minimal - meaning we don't see a large number of transactions.  Again - this is just a general starting point.

    For larger systems - a minimum of 5 drives:

    • System DB's
    • User Database Files (mdf/ndf)
    • User Transaction Logs (ldf)
    • TempDB
    • Backups

    And for very large systems - we start with the above and add mount points as needed.

    Note: if you have a very good network then backing up across the network is feasible.  In that case, you wouldn't need a backup drive - but I would still include the drive so we have something available for special cases where we need an off schedule backup (e.g. prior to an upgrade).  That way we can perform copy only backups without interfering with the scheduled backups.

    Jeffrey Williams
    “We are all faced with a series of great opportunities brilliantly disguised as impossible situations.”

    ― Charles R. Swindoll

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Without some form of enforcement, standards don't matter at all except, possibly, when it comes to needing a reason to fire someone.  Urgencies are not justification not to follow a proven good standard.  In fact, those are the times when you most need to follow the standards.

    Heh... and then there's the case of standards being incorrect or not recognizing exceptions to the rule.

    A good example of the latter is that most DBAs know that Trace Flag 1117 (balance files) and Trace Flat 1118 (creation of extents instead of pages, essentially) were a really, really good "Best Practice".  It was such a good idea, that they made it the default for all databases.  That was a good idea, as well.

    What's a really, really BAD idea is that they allowed for NO EXCEPTIONS in Temp DB because they couldn't envision a case where it might be a bad idea.

    What they forgot about is the bug where, even if you're using minimal logging to copy a table with an IDENTITY column and you've used SET IDENTITY_INSERT tablename ON, it s that it will sort the entire Clustered Index in Temp DB.... just imagine my joy when I'd been operating for half a decade on eight 2GB files for Temp DB for years and then I try to copy a 250GB table.

    Good standards are good IF they allow for the occasional unforeseen exception, they're enforceable, and they're always enforced through peer reviews, QA, etc, etc.  To Jeffrey Williams point above, bad standards are still bad no matter how many people and "experts" say they're a "Best Practice".

    And example of the latter are the supposed "Best Practices" of doing REORGANIZE between 5 and 30% logical fragmentation and doing a REBUILD only after 30%.  Most of us bought that hook line and sinker because, especially way back when, we bought what Microsoft posted in their documentation... except there was some unfortunate wording in the "suggestion" they made and very few people either read the foot notes or ignored them because of the unfortunate wording be fore that.  It turns out those supposed "Best Practices" were solely responsible for my log file sizes, the major blocking that I experienced on Monday, the 18th of January 2026, and the actual reason why I had to do so much defragmentation.  In fact, those supposed "Best Practices" are the real reason behind the massive fragmentation for Random GUID Clustered Indexes.

    I never understood things like what REORGANIZE actually does (it does NOT follow the Fill Factor in a manner that so many people think it does) and I never understood that the use of REORGANIZE is the main reason why so many indexes still fragment even if lowering the FILL FACTOR on them to help prevent fragmentation is actually a good idea.

    It hints at all of that in the documentation of what people have incorrectly adopted as a "Best Practice".  And, that's an example of a viral but very bad practice, especially since the original author of the information that was documented said in a separate article as follows (the emphasis is mine):

    "They’re not set in stone – they’re a big generalization, and there are a ton of other factors that may affect your choice of threshold and fragmentation removal method ... {snip} ... In the meantime, TAKE THOSE NUMBERS WITH A PINCH OF SALT AND DON’T TREAT THEM AS ABSOLUTE.".

    Here's the link to that article... it was written way back in 2009 and yet almost everyone as been using a "Best Practice" that was meant to be a "starting point" and not a blanket "Best Practice", which is actually a "Worst Practice" if you use as a panacea.

    https://www.sqlskills.com/blogs/paul/where-do-the-books-online-index-fragmentation-thresholds-come-from/

    And, no... I am NOT saying that standards are a bad thing to have.  You MUST have standards but you must also enforce them (especially during urgencies) and you MUST allow for caveats to handle the extraordinary.  The standards we wrote for where I work have a detailed list of the people (by position or station) on who can make a decision to "Override" a standard and to what extent and what the followup actions must be when such an "Override" occurs.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Just to refer to the orginal problem:

    The root problem of the big log file could be fixed / prevented, if you would partition your tables. Ola's script would in this case only defragment one partition after the other (and of course only fragmented partitions, leaving your old, unchanged data untouched).

    Of course partitioning may bring its own complexity (e.g. regaring unique indexes, which MUST include the partition column) and ideally you need to have the partition column in every WHERE or JOIN (can speed up the whole query by a lot, while your queries without specifying the partition column may become a bit slower)

    God is real, unless declared integer.

  • Thomas Franz wrote:

    Just to refer to the orginal problem: The root problem of the big log file could be fixed / prevented, if you would partition your tables. Ola's script would in this case only defragment one partition after the other (and of course only fragmented partitions, leaving your old, unchanged data untouched).

    Of course partitioning may bring its own complexity (e.g. regaring unique indexes, which MUST include the partition column) and ideally you need to have the partition column in every WHERE or JOIN (can speed up the whole query by a lot, while your queries without specifying the partition column may become a bit slower)

    That will help in the long term... well... unless you continue to blindly use REORGANIZE and continue to think that any index maintenance solution is some form of a panacea.

    Again, Step 1 for all of this would be to stop using REORGANIZE.  It's not the resource-gentle kitty that everyone makes it out to be and it actually perpetuates fragmentation by removing any free space below the Fill Factor of an index by compressing everything UP to the Fill Factor and, because it will not create new pages, does absolutely NOTHING to clear the area above the Fill Factor for those indexes that would actually benefit from lowered Fill Factor.

    The other thing is that if you have indexes that are fragmenting that have a "0" or "100" Fill Factor, stop doing any index maintenance on them base on logical fragmentation.  All you're doing is setting up for MASSIVE fragmentation by remove all free space from above the Fill Factor.  In other words, you're not only perpetuating fragmentation, you're also making it a thousand times worse on the proverbial "morning after".

    In fact, until you realize that there is no panacea method for index maintenance, stop doing index maintenance.  You're just making matters much worse than they should be for pages splits, log file usage, blocking, etc, etc.  Just rebuild stats and any indexes that have a page density of less than 80% and then, when you DO rebuild those, rebuild them at 82%.

    It's what I did from the 18th of January 2016 until January of 2020 not only with no ill effects but performance actually got lot better, log file usage plummeted, etc, etc.  I've lately taken to doing some index maintenance to help prevent fragmentation altogether but I went 4 years without doing any with great success.  In other words, the only time I did index maintenance was for space recovery and then I left headroom to try to decrease fragmentation.   Even that is a bit stupid if you have ever-increasing indexes where you do inserts and follow shortly after that by updating the hotspot.  In such an instance, the hotspot will ALWAYS fragment like crazy until you fix what's causing the "ExpAnsive" updates.

    Get off the drug known as "index maintenance"... If you're not going to do it right (and the generally accepted "Best Practices" is doing it the worst way possible... they were never meant to be a "Best Practice") or you insist on using REORGANIZE because you think it's "better" or you continue to blindly use REBUILD without understanding the true nature of each and every index, stop doing it because you're doing it wrong and it's absolutely killing you.

    If that sounds a bit brusk or insistent, you've read it correctly... Do it right or stop doing it.  It's like a bad drug.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply