Indexes, Indexes, Indexes

  • Brian O'Leary (7/5/2011)


    doranmackay (7/5/2011)


    For me it was quick to find the root cause.

    So when you have stuff like "where [datetimecolumn] >=getdate()" (if datetimecolumn is indexed)

    the optimizer says "You know what, range scan on date time is so slow jumping between index and record and I have to do it a milliion times, let me rather do a table scan, as there is no jumping between index and record".

    It then avoids the index at all costs cause it would be, well, costly.

    Thats not true. If your nonclustered index is not covering you will still need to do a bookmark or rid lookup (thats the point of covering) regardless of the data type. The descision to perform a range scan by the optimizer is to do with cardinality rather than the use of a datetime data type.

    For example, run this script against the AdventureWorks database and you will see an index seek based on the date filter, however if you change the date from '20040703' to '20040702' the optimzer will choose a clustered index scan.

    Correct, I should have included that the specific indexes existed on a highly transactional database.

    DateTime has low cardinality when only using the date portion,and based on the size of the result set estimated in adventureworks, I would expect successful usage of datetime index since the results from smaller tables is few.

    Do the same test, where you have 1000 unique records being created per hour, and modified, with a couple of joins happening and due to the row count of the tables, a date filter will result in too much records. In those situations, inserts and updates cost because the likelihood of benefiting from a datetime index is zero unless you can specify the datetime down to the millisecond and use where x = '20040703.123456'.

    Catch-all queries done right [/url]
    Gail Shaw's Performance Blog[/url]

  • Good article covering a common situation. Indexes are created because we assume they will be the access path (as pointed out in the DateTime discussion) but because they either aren't covering the entire query result or the result set is too large, we end up with indexes that are never used.

    As to the "It will get slow if we drop these indexes" discussion, my only comment to that is "We can add them back". It's not like they purging data which cannot be recovered.

  • Toby Harman (7/5/2011)


    Good article covering a common situation. Indexes are created because we assume they will be the access path (as pointed out in the DateTime discussion) but because they either aren't covering the entire query result or the result set is too large, we end up with indexes that are never used.

    As to the "It will get slow if we drop these indexes" discussion, my only comment to that is "We can add them back". It's not like they purging data which cannot be recovered.

    Just to humor them you could disable them so that you can "easily" reenable them in the future... they probably don't know the difference anyways :w00t:.

  • Toby Harman (7/5/2011)


    Good article covering a common situation. Indexes are created because we assume they will be the access path (as pointed out in the DateTime discussion) but because they either aren't covering the entire query result or the result set is too large, we end up with indexes that are never used.

    As to the "It will get slow if we drop these indexes" discussion, my only comment to that is "We can add them back". It's not like they purging data which cannot be recovered.

    The moment they are dropped, users will believe the system is running slower. Furthermore any application problems will be initially blamed on the missing indexes.

  • scottm30 (7/5/2011)


    The moment they are dropped, users will believe the system is running slower. Furthermore any application problems will be initially blamed on the missing indexes.

    So don't tell them when you drop them. Do the usual CYA stuff and tell your boss and explain to him exactly why you want to be sneaky!

    Remember Schrödinger's cat and the Observer-expectancy effect! If they don't know they are being observed then they'll behave as normal. They'll still be complaining, but no more than normal.

    Either that or drop the indexes, and the say that you have re-added them after the complaints come but leave them disabled.

    There's more than one way to fool a user!

  • Why is there not more concern being expressed when the query cited by peter-757102 (see page 1) is probably causing incorrect results from this tool? I posted a related query (see page 2) to try to cause others to investigate. I have seen people criticize matters of style with conviction in other threads, but only two of us are mentioning a probable bug here. Am I missing something? If so, please correct me. The explanation by Steven993 seems to fall short.

  • Nice reference to Alice's Restaurant!

  • BW_Toro (7/6/2011)


    Why is there not more concern being expressed when the query cited by peter-757102 (see page 1) is probably causing incorrect results from this tool? I posted a related query (see page 2) to try to cause others to investigate. I have seen people criticize matters of style with conviction in other threads, but only two of us are mentioning a probable bug here. Am I missing something? If so, please correct me. The explanation by Steven993 seems to fall short.

    The query as highlighted by peter-757102 is certainly wrong (of course it could be a copy paste error!). [object_id] is unique within a database, and sys.dm_db_index_usage_stats records index usage for all databases. So a database_id filter is required to produce a correct set.

    However, the effect of this particular error would be to return a index usage row n times (slightly better than it not being returned at all), that doesnt change the fact it's potentially not being used (i.e. wasting disk space, cpu and (memory?) due to higher insert, update and maintenance overhead). On large heavily updated tables this can lead to noticeable performance problems.

    Which leads me to comment on a couple of comments posted that advocate simply removing or disabling them because nobody will notice. Firstly I would say if you try that on a large table with many concurrent users you could come unstuck. Secondly, in my opinion, as DBA's our job is to provide the evidence / science to the business to let them decide (unless of course they have made the decsion to let you decide......often the case) the course of action they want to take. They are the ones who pay for it and use it!

    MCITP SQL Server 2005/2008 DBA/DBD

  • Have to ask whats Alice's Restaurant??

    MCITP SQL Server 2005/2008 DBA/DBD

  • Oh it wasn't intentional!? Google the lyrics to alices restaurant look for the part which is almost verbatim about the pictures with circles and arrows and a paragraph on the back of each one...

    You'll get a laugh!

  • BW_Toro (7/6/2011)


    Why is there not more concern being expressed when the query cited by peter-757102 (see page 1) is probably causing incorrect results from this tool? I posted a related query (see page 2) to try to cause others to investigate. I have seen people criticize matters of style with conviction in other threads, but only two of us are mentioning a probable bug here. Am I missing something? If so, please correct me. The explanation by Steven993 seems to fall short.

    try http://blog.sqlauthority.com/2008/10/03/sql-server-2008-find-if-index-is-being-used-in-database/[/url] by Pinal Dave. It does the same thing.

    This query gave me accurate enough data to start investigating.

    Duplicate indexes should be hit list number one.

    Pinal Dave also has scripts to locate duplicate indexes, and pairs of indexes that are contained by others that would probably be helpful to anyone reading this article.

    On the general idea in other comments that the unused indexes that was created originally should be left alone, I have yet to find any published text which advocates leaving unused non clustered indexes in a transactional table. Something by Microsoft or Oracle saying when we should leave them would be sufficient to sway me.

    Catch-all queries done right [/url]
    Gail Shaw's Performance Blog[/url]

  • I think the query as highlighted is an accurate representation. The effect of the problem is more than simply listing an index more than once, it can list an index more than once where some rows include index usage stats from other indexes. I have not followed through everything in the tool to see how it reacts to the misleading data, but it could potentially result in showing an index as unused when it's not.

  • BW_Toro (7/6/2011)


    I think the query as highlighted is an accurate representation. The effect of the problem is more than simply listing an index more than once, it can list an index more than once where some rows include index usage stats from other indexes. I have not followed through everything in the tool to see how it reacts to the misleading data, but it could potentially result in showing an index as unused when it's not.

    It wouldnt show an index as used when its not, neither would it aggregate them, it simply has the potential to return the same row twice, giving the impression that things are "worse" than they really are. No query known to return inaccurate results can be an accurate representation of anything (I can argue that fact all day :-)).

    EDIT: After re reading the original post, these stats are in fact aggregated once the data has been collected in a central point.

    MCITP SQL Server 2005/2008 DBA/DBD

  • I believe that the query is incorrect and need to have the database filter (AND S.database_id=DB_ID()) added to it. I ran the query both with and without and ended up with different, meaning incorrect results. Will the author of the article please address this issue?

    Thanks,

    MWise

  • Run the query below. It's a copy of the query in the article with two extra columns and an ORDER BY clause. You will very likely see rows where an index name is associated with correct index usage stats, but also index usage stats from a different index for a different table in a different database (even system databases). The extra columns reveal where the other index lives. The OBJECT_NAME function, without a database_id parameter, returns the object from the current database, fooling you into thinking the index usage stats are for one table when they are for another table in another database. I have no idea how the tool knows which set of index usage stats is correct. The tool could be showing an index as unused when it's used, or it could be showing an index as used when it's unused. It depends which row (for a given object and index) the tool uses for the index usage stats.

    select OBJECT_NAME(S.[OBJECT_ID]) AS [OBJECT NAME],

    DB_NAME(S.DATABASE_ID),

    OBJECT_NAME(S.[OBJECT_ID],S.DATABASE_ID),

    I.[NAME] AS [INDEX NAME],

    USER_SEEKS,

    USER_SCANS,

    USER_LOOKUPS,

    USER_UPDATES,

    GetDate()

    FROM SYS.DM_DB_INDEX_USAGE_STATS AS S

    INNER JOIN SYS.INDEXES AS I

    ON I.[OBJECT_ID] = S.[OBJECT_ID]

    AND I.INDEX_ID = S.INDEX_ID

    ORDER BY OBJECT_NAME(S.[OBJECT_ID])

    , I.[NAME]

Viewing 15 posts - 16 through 30 (of 43 total)

You must be logged in to reply to this topic. Login to reply