Making Indexes Invisble to the Optimiser

  • kevaburg - Monday, October 23, 2017 11:54 AM

    Jeff Moden - Saturday, October 21, 2017 3:09 PM

    kevaburg - Saturday, October 21, 2017 12:56 AM

    Another reason I want to avoid hints is this:  I attended a lecture from a MCM here in Germany and he made this statement:
    "If you feel the only way to solve a problem is to use a hint you are making a conscious ascertation that you understand the optimiser better than the 200-300 professional developers in Redmond that developed it".

    Heh... You mean the same folks that developed the crippled PIVOT even though MS has a great model of a non-crippled version in ACCESS... the ones that crippled performance with FORMAT... the ones that made recursive CTEs run slower and more resource intensive than a While Loop... the ones that created a splitter function with no guarantee of sort order and no return of element position... the ones that cause online rebuilds to sometimes corrupt the clustered index in 2012... the ones that refuse to have a built in Tally Table function because they don't understand how it would be used... the ones that keep deprecating functionality and frequently replacing it with something that doesn't work as well... the ones that caused SSIS installations to go corrupt after an SP release... the ones that still haven't figured out how to do a Peter Norton style shrink of a database even though the technology has existed for more than 3 decades... the ones that still haven't created an easy to use true CSV file importer for T-SQL.. the ones that have to issue CUs faster than most people balance their checkbook to fix stuff on a product that's been around for 30 years... or the ones that blew some major portions of the code at work out of the water with their new cardinality estimator?

    I have a huge appreciation for the problems associated with complexities of the code behind SQL Server but the folks in Redmond aren't gods (and that's not a slam against them but could be in the future if they keep going the way they're going 😉 ).  You don't actually have to know or understand the optimizer as they claim to know when it's done something wrong and needs some help and that's a part of the reason why they did include index hints.

    To wit, I take strong exception to the statement you quoted because sometimes the answer is "The people in Redmond don't have to solve this particular problem... I DO"!!

    Wow.....I feel like I have just been verbally body-slammed! 😀

    I can appreciate your comments without any qualms at all although the problems you have outlined are not problems I have (thankfully) had to deal with up until now.  But the sorts of issues that SQL Server has are not unlike those found in Oracle (now we are talking about problems......ORA600, ORA7445) but then this is also the reason for my original post.  Testing at any level will produce results that can sometimes digress from what we would regard as normal in a production environment and I have seen this in the cardinality estimator first hand.  In this case though, reverting back to the legacy estimator solved the problem.

     I am not regarding the Redmond guys as Gods, we all know they aren't but they work under a different environment (sandboxed, safe, call it what you will) and can't test every permutation of every possible outcome so problems will occur especially in programs as large and complex as Microsoft products have become over the years.  My OP was based on finding a way to help alleviate a problem associated with performance optimisation with minimal risk and here, how the optimiser works and selects an index plays a large role.  All I wanted was quite simply to remove the index from the list of available indexes to choose from without losing control should I need to roll back.

    Up until now my primary position has been as a DBA and during that time I have experienced a lot of problems because of quite simply bad SQL and that badness was resolved using hints, heavy use of parallelism (because more cores = more threads = better performance.....we have all seen this as well) and a desire to ignore what the optimiser has planned because a) the developers can't read execution plans, b) the developers don't understand indexes and statistics and c) don't understand the consequences of putting this all straight when it eventually goes wrong - the job of the DBA. 

    The only advice I give the developers is to use hints only if every other avenue has been exhausted, even if that means going to Microsoft and using the support they pay for.  I don't believe in using hints if they aren't really necessary because I see them as more of a plaster on a festering wound than a solution and I know for a fact that if we relax that rule the promise of "we will use the hint for now and fix the problem later" will never come to fruition.

    I have gone massively off-track but I just wanted to justify why I take this stand the way I do.  The single most valuable asset a company has is data, at least in my opinion, and I just want to be sure that it is treated the way I believe it should be treated.

    Jeff Moden - Monday, October 23, 2017 5:45 PM

    For what it's worth, I absolutely agree that the use of index hints should generally be an absolute last resort and then you should still go back and see what you're missing.

    And sorry for the long winded retort.  Allow me to explain the fervor there.

    I regularly try to dispel supposed "Best Practice" recommendations that are based on hear-say, anecdotes, old-rev lessons that mostly no longer apply (the use of SELECT/INTO is a really good example there), proverbial old wives tales, or just because enough people think it's a cool-kid thing to say.  I've heard the "smarter than Redmond" anecdote from numerous people and, in context and in person (to get the tonal inflection, which is frequently based in technical humor), it's quite easy to understand that the speaker is saying that it's generally a bad idea.  I have no problem with that.

    The problems occur when someone cites the fact that person is an MCM (the best cert there ever was or ever will be) to support an argument for something.  A thousand lurkers will read something like that and take it as absolute gospel.  Word spreads and suddenly we have another "Best Practice" that people, especially the auditors and less informed DBAs that I have to work with, and then I have to go through all sorts of proofs to prove to them that they don't know what they don't know.

    Heh... I know you think that sounds extraordinary but it's happened several times.  SELECT/INTO locking TempDB, Recursive CTEs being good, cursors and while loops being absolutely forbidden, not being allowed to use Ad Hoc Distributed Queries, not being allowed to use OLE Automation Procedures, being forced to use MERGE for joined updates instead of update because someone doesn't actually know how to use it correctly, table hints, index hints, query hints (OPTION (RECOMPILE will save your life for "Minimal Logging), and the king of them all, not being allowed to us xp_CmdShell because of its supposed security risk.  And all of those have been forbidden by folks that don't actually understand what's going on but insist on not using them because they're supposedly a "Best Practice" even if they understand that there can be exceptions to "Best Practices".

    So, every time I see someone cite something someone said and then with no proof as to utility or harm says "Good enough for me" just because of the speaker's supposed or actual pedigree, I go nuts.

    Help stop dis-information and misinformation.  Stop promulgating such anecdotal recommendations.

    @jeff & @kevaburg, I think you're both absolutely correct... I also think you're talking about two different things...

    To kevaburg's point, I work with a database that was built for years by C#.net developers before they ever brought an actual DBA on board and his solution was to add "WITH(NOLOCK)" all of the stored procs (I wish that was hyperbole but it isn't)... And I know firsthand what sys.indexes look like after the CIO saw the indexes that were requested by the DTA and said, "OKAY!"... The majority of that is cleaned up now but I've born witness to the sea of "_dta_index_"...  I've been informed by senior level application developer/manager that the database is just "data persistence" layer for the application, joins are slow and that it makes no sense not to store the data on table the same way it's presented in the application. Plus, the developers who have to be going out of their way to write unnecessarily expensive t-sql... Managing to create execution plans that million of row coming together and merging into billions of rows, all the while artfully dodging any type of filtering until just before the SELECT node... where they pluck out a singe row data... and then loop the whole thing until the screen is fully populated with rows... (Wow! that actually felt really good getting that off my chest...)

    To Jeff's point, Inflexible absolute mandates aren't the answer either... I know of several companies in town that won't allow the use of user defined functions, no distinction between iTVFs and "the other two". 

    In short, given that there this world contains developers who genuinely want to produce the best code they possibly can, one end of the spectrum... and actual saboteurs on the other... Personally, I've taken the position that certain rules have to exist... but with one simple caveat... Provide a valid, repeatable, test script with at least one version of the code that would meet the "best practices" standard... Winner takes all ... period, full stop... Performance should absolutely trump any guideline or standard. 

    That said.. simply asking for a few valid parameter values just to verify that the change isn't taking things in the wrong direction, it becomes immediately obvious that, not only was the code not rigorously tested, it's unlikely that was even executed all.

    Just my 2 cents...

  • I don't have a problem with index hints, even in production, so long as it's presumptive need is evidence based. Hints are declarative, and although the guy taking over your job may not know why you originally used something like an INDEX or force MERGE JOIN hint, they can at least see that it's there and then choose whether or not to keep it going forward. When attempting to influence the optimizer, hints are more transparent and self-documenting than doing weird stuff like adding 1=1 to the WHERE clause or joining on (SELECT 1), in which case the new guy is totally baffled about why you're coding that way, and the syntax hack may not work the same way in the next SQL Server release. Query hints are also more granular than setting global trace flags or server / database options, because such global settings won't effect every query in the same way.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • "Help stop dis-information and misinformation. Stop promulgating such anecdotal recommendations."

    Hi Jeff....I read through my post and am disappointed in myself for doing that!  It won't happen again. 🙂

    Anyway, back to the OP, a recommendation has gone off to Microsoft so lets see what happens.....if anything!

  • kevaburg - Thursday, October 26, 2017 5:33 AM

    "Help stop dis-information and misinformation. Stop promulgating such anecdotal recommendations."

    Hi Jeff....I read through my post and am disappointed in myself for doing that!  It won't happen again. 🙂

    Anyway, back to the OP, a recommendation has gone off to Microsoft so lets see what happens.....if anything!

    Your a good man, kind Sir.  Thanks for understanding what I was talking about.  And apologies for my first response seeming like a body slam.  It certainly wasn't meant to be.  You're definitely one of the good guys.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 4 posts - 16 through 18 (of 18 total)

You must be logged in to reply to this topic. Login to reply