The Ultimate Missing Index Finder

  • Great util!! One little improvement I made... I added this to the front of the SELECT (just before the first column) to make the create statements, that way you could just copy/paste the results of that into a new qry to run the create statements you want.

    'CREATE NONCLUSTERED INDEX IX1_' + object_name(sys.dm_db_missing_index_details.object_id)

    + left(cast(newid() as varchar(max)),5) + char(10) + 'on [dbo].'

    + object_name(sys.dm_db_missing_index_details.object_id) + ' (' +

    case

    when sys.dm_db_missing_index_details.equality_columns is not null

    and sys.dm_db_missing_index_details.inequality_columns is not null

    then sys.dm_db_missing_index_details.equality_columns

    + ',' + sys.dm_db_missing_index_details.inequality_columns

    when sys.dm_db_missing_index_details.equality_columns is not null

    and sys.dm_db_missing_index_details.inequality_columns is null

    then sys.dm_db_missing_index_details.equality_columns

    when sys.dm_db_missing_index_details.inequality_columns is not null

    then sys.dm_db_missing_index_details.inequality_columns

    end

    + ')' + char(10)

    +

    case

    when sys.dm_db_missing_index_details.included_columns is not null

    then 'Include (' + sys.dm_db_missing_index_details.included_columns + ')'

    else ''

    end as CreateIndexStmt

    --Vincent yovincent@hotmail.com

    EDIT:

    The missing index DMVs this qry is based on often report missing indexes that, if created, would have too many columns... Imagine the performance impact when you go to do an update or insert on a table with a 50 column index!! Therefore, I also added this to the end of the WHERE clause to keep the qry from returning results with a ton of included columns.

    AND isnull(Len(included_columns)-Len(Replace(included_columns,',',''))+1,1)<10

    --Vincent yovincent@hotmail.com

  • The missing index DMVs this qry is based on often report missing indexes that, if created, would have too many columns... Imagine the performance impact when you go to do an update or insert on a table with a 50 column index!! Therefore, I also added this to the end of the WHERE clause to keep the qry from returning results with a ton of included columns.

    You also get a few entries where the long list of includes is only differing by one or has one or two additional or missing include columns. That is why it is not a good idea to blindly copy and paste pre-generated index create statements (I wouldn't do this with DTA either), but rather to merge whenever and as much as possible. One want one key and two includes, another want this one key and one other, but no includes. These can be merged.

  • It's probably because you created an index against the AnalysisHolding table when the result was complaining about the Notice table.

    Scott Herbert (10/9/2008)


    This is a great proc; I've been using something similar for a while, but this adds in a few more useful columns. One thing that's baffling me however is that some of the highest results from this proc I've subsequently added indexes for, and they still turn up (unlike most instances where as soon as the index is added, they disappear from the proc results).

    For example, from the proc results:

    schema name = Analysis

    Table name = Notice

    Equality = NULL

    Inequality = [NoticeStageID]

    Include = [ID], [AnalysisID], [FileID]

    Score = 25953.496496

    I've then added the index

    CREATE INDEX [ixAnalysisHolding_AnalysisID_CustodianBeneficialOwnerID]

    ON [Analysis].[AnalysisHolding]

    ([AnalysisID], [CustodianBeneficialOwnerID])

    INCLUDE([ID], [RegisteredHolderBeneficialOwnerID], [BeneficialOwnerInvestorID], [Shares])

    WITH ( FILLFACTOR = 90 );

    and no difference to the proc results. What's going on? Am I missing the significance of the "Inequality" column, or is this a factor of the way the DMVs work?

  • First off, I just wanted to say great job on these scripts, I am using them to tune my database at the moment and was hoping for some of your expert advice on trying to create as few indexes as possible from the overload of information SQL Server generates from the DMVs.

    Take this sample suggestion for a missing index for example (sorry it's an image, this seems to be the best way to keep the formatting):

    The equality columns are a no brainer. Now, considering the inequality columns and the included columns, what would the best way to have these 2 suggestions be handled best by 1 index? From what I understand, the order matters for the seek columns (equality and inequality), but not for the included columns. So is it even possible for both of these to be covered well by 1 index?

    Also, if I have lots of index suggestions with the same equality and inequality fields but different included fields, is it better to include more fields or go with less included fields? Again, the goal is to create 1 index vs. 3 (if 3 have different included columns).

    Any help or guidance here you can provide would be greatly appreciated. I am trying to avoid having index overload and trying to get a handle on exactly the best way to proceed here. Thanks again for your wonderful scripts, they're a big help!

    YeshuaAgapao (11/26/2008)


    The missing index DMVs this qry is based on often report missing indexes that, if created, would have too many columns... Imagine the performance impact when you go to do an update or insert on a table with a 50 column index!! Therefore, I also added this to the end of the WHERE clause to keep the qry from returning results with a ton of included columns.

    You also get a few entries where the long list of includes is only differing by one or has one or two additional or missing include columns. That is why it is not a good idea to blindly copy and paste pre-generated index create statements (I wouldn't do this with DTA either), but rather to merge whenever and as much as possible. One want one key and two includes, another want this one key and one other, but no includes. These can be merged.

  • I would do (FranchiseID, AppointmentTypeID, ApptDateTime) INCLUDE (CustomerID)

    This is because only one inequality column ( =,<=, IN(), BETWEEN) can be range-scanned. If you wanted to follow BOL documentation for the missing index DMVs, which tells you to put all equality and all inequality columns as seeks, then I would put CustomerID at the end of the seek columns - (FranchiseID, AppointmentTypeID, ApptDateTime, CustomerID).

    If you got one of those heavy read/write tables and it is being suggested that you need to include the world, then I wouldn't. Go in the code and look for sloppy queries to clean up (SELECT * in particular, but they can also list 20 columns and then use 5). If you can't change the code, then just cover the WHERE clause and JOIN predicates (put non-range-scannable inequality predicates like OR or <> in as includes) so at least all the filtering can be done prior to the bookmark lookup. If the 'include the world' is spread out over multiple results and the higher offending results have just a few, then just include that just a few. Make sure you do have coverage for every WHERE and JOIN predicate though.

  • don kitchen (1/15/2009)


    It's probably because you created an index against the AnalysisHolding table when the result was complaining about the Notice table.

    Yep, that's embarrassing. 😛 Copy and paste error from two examples of the same problem. The index I meant to post was

    CREATE NONCLUSTERED INDEX [IX_Notice_NoticeStageID] ON [Analysis].[Notice]

    (

    [NoticeStageID] ASC

    )

    INCLUDE ( [ID],

    [AnalysisID],

    [FileID])

  • Could you give a quick rundown on how to interpret the results properly. I am not entirely clear on how to read these the best way.

    Thanks!

  • /*1)Below query is to Findout the Missing Indexes on the tables.

    This will generate the Create Index script based on the queries that are executed on the server by checking the columns

    used in equal/inequal /where clause. */

    SELECT statement, migroupsstats.avg_total_user_cost as AvgTotalUserCostThatCouldbeReduced, migroupsstats.avg_user_impact as AvgPercentageBenefit,

    'CREATE INDEX missing_IX_' + CONVERT (varchar, object_name(mid.object_id))

    + CASE WHEN mid.equality_columns IS NOT NULL THEN '_' ELSE '' END

    + ISNULL (replace(replace(replace(mid.equality_columns,', ','_'),']',''),'[',''),'')

    + CASE WHEN mid.inequality_columns IS NOT NULL THEN '_' ELSE '' END

    + ISNULL (replace(replace(replace(mid.inequality_columns,', ','_'),']',''),'[',''), '')

    + ' ON ' + mid.statement

    + ' (' + ISNULL (mid.equality_columns,'')

    + CASE WHEN mid.equality_columns IS NOT NULL AND mid.inequality_columns IS NOT NULL THEN ',' ELSE '' END

    + ISNULL (mid.inequality_columns, '')+ ')'

    + CASE WHEN mid.included_columns IS NOT NULL THEN + ISNULL (' INCLUDE (' + mid.included_columns + ')', '') ELSE

    '' END AS create_index_statement

    FROM sys.dm_db_missing_index_groups migroups

    INNER JOIN sys.dm_db_missing_index_group_stats migroupsstats ON migroupsstats.group_handle = migroups.index_group_handle

    INNER JOIN sys.dm_db_missing_index_details mid ON migroups.index_handle = mid.index_handle

    order by 1

    Srihari Nandamuri

    Comments please....

    Srihari Nandamuri

  • I'm afraid all the results in your query is indexes that already exist in my database.Am I missing something? I'll have to go through your query step by step to see what exactly it is that it does but time is tight just now.:w00t:

    :-PManie Verster
    Developer
    Johannesburg
    South Africa

    I can do all things through Christ who strengthens me. - Holy Bible
    I am a man of fixed and unbending principles, the first of which is to be flexible at all times. - Everett Mckinley Dirkson (Well, I am trying. - Manie Verster)

  • Thank you, thank you, thank you. I used this script and just became a hero at work. :w00t:

  • This just peeks into the dynamic management views and does the joins for you, possibly replaces the comma delimiters with another of you chossing (if you pick something other than the default comma) for csv export, computes a ranking score, and sorts it. The indexes that come out will match sys.dm_db_missing_index_details . The numbers come from sys.dm_db_missing_index_group_stats . The query optimizer puts the missing indexes in when it finds that it has to scan the table. I don't know what causes them to get taken out.

    resetting the missing index DMVs 'may' fix the problem of the false positives, if they are not being taken out correctly. Otherwise I don't know what will get around the problem.

    DBCC SQLPERF('sys.dm_os_wait_stats',CLEAR) appears to work

    but..

    DBCC SQLPERF('sys.dm_db_missing_index_details',CLEAR);

    DBCC SQLPERF('sys.dm_db_missing_index_groups',CLEAR);

    DBCC SQLPERF('sys.dm_db_missing_index_group_stats',CLEAR);

    Does ot appear to work.

    The only way to reset the missing_index DMVs appears to be restarting the SQL Server instance.

    Here are other limitations of the missing index DMVs from BOL (The 500-limit, and the non-coverage of trivial plans - queries without joins - are the biggies)

    *It is not intended to fine tune an indexing configuration. (Only good for quick and dirty tuning with a small timeframe, though it spots terrible offenders pretty good)

    *It cannot gather statistics for more than 500 missing index groups. (Requires a DMV reset; only by server restart; to get around)

    *It does not specify an order for columns to be used in an index. (You can derive this a little by merging entries)

    *For queries involving only inequality predicates, it returns less accurate cost information.

    *It reports only include columns for some queries, so index key columns must be manually selected.

    *It returns only raw information about columns on which indexes might be missing.

    *It does not suggest filtered indexes.

    *It can return different costs for the same missing index group that appears multiple times in XML Showplans.

    *It does not consider trivial query plans.

    These are the 3 main ways I use to tune indexes from quickest/least accurate to most time consuming/most accurate; accuracy affects primarily include column selection and effectiveness of composite indexes):

    *The missing index DMVs (Util_Missing Indexes)

    *Analyze query plans of SQL profiler trace worst offender aggregate reports

    *Trace and crawl through the application, weighing by user-traffic if possible to know.

    Another think that can cause an otherwise perfect index to get ingnored, especially in databases without foreign key constraints (at least in the Dev/QA environments): incompatible types in JOIN and WHERE predicates. If you provide a int predicate for a varchar column (where the varchar values is all int-castable), it will want to cast the varchar to int, wrecking any possibility of index usage. Casting your predicates will get around it for a quick-fix, but having your FK-links be identical in type (as required by FK constraints) will eliminate that problem, at least for JOINs. I don't recall this scenario adding missing index entries and creating 'false postives' But it is something that the missing index DMVs won't find.

  • After a year of using this in production (see my post in Nov. 2008) and comparing results with the Performance Dashboard report on missing indexes, I don't see any advantage to this proc over the dashboard report, nor do I really understand why they often get different results.

  • Is there a way to modify the procedure to be outside the database, perhapse another argument to select the database. I have a admin database that I store reports and procedures in and then run those against the databases I need results for. This keeps custom objects outside of applications where it would sometimes void warranty and support agreements.

  • Hi folks

    I just came across this script and one thing that puzzled me was this piece of code:

    (CONVERT(Numeric(19,6), sys.dm_db_missing_index_group_stats.user_seeks)+CONVERT(Numeric(19,6), sys.dm_db_missing_index_group_stats.unique_compiles))*CONVERT(Numeric(19,6), sys.dm_db_missing_index_group_stats.avg_total_user_cost)*CONVERT(Numeric(19,6), sys.dm_db_missing_index_group_stats.avg_user_impact/100.0) AS Score

    I usually saw the seeks added up with the scans, not with the unique compiles. Is this intended like this, is it a typo (+ instead of *) or it should have been user_scans instead of unique_compiles ?

    Thanks,

    Alin

  • Thanks for the script.

Viewing 15 posts - 16 through 29 (of 29 total)

You must be logged in to reply to this topic. Login to reply