Automatically Create indexes as needed

  • Comments posted to this topic are about the item Automatically Create indexes as needed

  • While this may have good intentions, I certainly would not have this script create the databases and would not have this as a scheduled job.  Since it uses the missing indexes DMVs, it's recommendations must be taken very cautiously!  It can recommend duplicate indexes, it can recommend indexes that can actually hurt performance in other queries, it will not even recommend the best order for columns within an index.

    See this TechNet article for more detailed information about the limitations of using the missing indexes feature of SQL Server:
    https://technet.microsoft.com/en-us/library/ms345485(v=sql.105).aspx

    The best thing you can do is look at the suggestions, compare them to existing indexes to ensure not a duplicate (sometimes you may just need to add a column to the included columns list instead of a brand new index) look at the columns in the index key and consider the selectivity of each of those columns and how frequently the table is queried by those columns to determine which of the suggested candidate indexes really are good indexes to try in your database.

  • This has been designed to specifically ensure that duplicate indexes never occur if you actually look at the script.  I have run this script for years with very good results that have never hurt performance.  Our web production database has it scheduled to run every day to ensure maximum performance at all times.  I have yet to see any negative impacts from running this script.  For large database in excess of 50gb I recommend restricting it to top 50 indexes found to avoid creating too many indexes.  I would challenge you to show a weakness in the script as I have yet to find one.  I recommend running this adhoc in large databases as needed rather than scheduling it.  It is not intended to fine tune a database but only to address major indexing problems with minimum hands-on work as necessary.  Fine tuning a database is still a manual process.

  • This is not a good idea.

    The missing index feature is a great place to start index tuning, but it will result in a lot of near-duplicate indexes (eg index on col1, col2; index in col1, col2, include col3; index on col1, col2, col4, include col3, col5). It's going to over-index the database resulting in more indexes than needed and hence slower modifications.

    I've had clients who have done things like this, or run DTA and accepted all suggestions (and DTA will generally come up with a smaller set of indexes than missing index DMV, as it considers the workload as a whole, while missing indexes considers queries in isolation), and they generally have a lot of overlapping indexes, overly large indexes (between key and include having most of the table), unused indexes and similar.

    The last time I saw that was a client last month. One table had 12 indexes on it (excluding the clustered index). After looking at all the queries that currently affect that table, I dropped the whole lot and created 3 which support all the queries better than the 12 did.

    This has been designed to specifically ensure that duplicate indexes never occur if you actually look at the script

    How? It's checking on name, but not the columns in the index that I could see.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It actually checks all custom indexes and compares the columns and include columns to ensure that they do not exist in other indexes.  As far as over-indexing goes that is merely something to monitor.  If necessary I also have a script that removes all of the customized indexes and allows it to rebuild the necessary indexing again from scratch.  This is the reason why I recommend running the job adhoc in larger databases.

  • steve 96379 - Wednesday, April 4, 2018 7:07 AM

    This has been designed to specifically ensure that duplicate indexes never occur if you actually look at the script.  I have run this script for years with very good results that have never hurt performance.  Our web production database has it scheduled to run every day to ensure maximum performance at all times.  I have yet to see any negative impacts from running this script.  For large database in excess of 50gb I recommend restricting it to top 50 indexes found to avoid creating too many indexes.  I would challenge you to show a weakness in the script as I have yet to find one.  I recommend running this adhoc in large databases as needed rather than scheduling it.  It is not intended to fine tune a database but only to address major indexing problems with minimum hands-on work as necessary.  Fine tuning a database is still a manual process.

    First, I really appreciate the fact that anyone would share information and try to make someone else's life a bit easier.  You're definitely one of the "good guys".  Thanks for sharing.

    To Chris' point, though, the trouble is that you don't provide those warnings (the ones in your response above) anywhere in your write up or your code.  You also don't warn that the creation of indexes is a wanton duplication of data and that on a huge but narrow table, SQL Server may include enough INCLUDEs to actually duplicate such a huge table.  Neither do you include the notion of a correct FILL FACTOR, which is uber important to INSERTs, "expansive" (which are "expensive") updates, and the horrible page splitting problems that occur when you Reorg or (especially) Rebuild an index (especially with a 100% Fill Factor) during index maintenance. 

    When you post something like this, you have to remember that many people are not prone to reading the discussions that follow.  Normally, they're desperate for a solution and such warnings and issues must be clearly identified right up front and short enough so that folks don't do the "TLDR:" thing.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Good points.  I had posted the script to help out and your comments are appreciated.  I'll try to keep that in mind so it is a bit more detailed the next time.  I expected some experience and background but that may not necessarily be the case.

  • Just took a quick look at the code, how does it ensure it isn't "duplicating" indexes that already exist?  Remember, a duplicate index doesn't necessarily mean one that has identical columns in identical order.  Using the DISTINCT may ensure that you don't create duplicate indexes based on the DMV's but I don't see anything that validates those against existing indexes.  The missing index DMV's will show you an index that will satisfy a specific query.

  • steve 96379 - Wednesday, April 4, 2018 7:59 AM

    It actually checks all custom indexes and compares the columns and include columns to ensure that they do not exist in other indexes. 

    Where?

    It's got a distinct on the columns and then a group by on the index create statement, which will  eliminate complete duplicates from being created at the same time, but the missing index feature generally doesn't add complete duplicates to the DMV. It's still able to create near-duplicates (col1, col2 vs col1 include col2) and it doesn't check for already existing duplicates or near duplicates.

    Also, why look at how many scans are on the table that the index is for to see whether to create more indexes on that table, rather than looking at the the estimated improvement % that's in the missing index DMV to determine whether to create the index?  It could end up creating indexes repeatedly on tables that are scanned a lot without looking at what the queries doing the scans are doing. If there are queries with only non-sargable predicates, no amount of indexes created on the tables they scan will stop them from scanning.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It's limited to the custom indexes that are created on purpose.  Assuming that other indexes are by nature already specifically created to address specific performance in the first place so they aren't designed to check those.  I suppose if you wanted to you could check all indexes but I have not had an issue with the delivered indexing so did not put a check on all of them since this seems a bit more than would be necessary on delivered indexing.  Also the scans check is mainly concerned with actual user input rather that any percentage basis on the indexes themselves.  This is so you address actual user performance as opposed to a database performance optimization.

  • steve 96379 - Wednesday, April 4, 2018 8:18 AM

    Good points.  I had posted the script to help out and your comments are appreciated.  I'll try to keep that in mind so it is a bit more detailed the next time.  I expected some experience and background but that may not necessarily be the case.

    Thanks for taking it all the right way.  And your last statement is totally spot on here.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks, Jeff.  One other thing to note.  This job is scripted to not have a scheduled run time specifically because of the issues that I previously talked about.
    Thanks!
    Steve

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply