Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Don’t Make Dumb Mistakes by Duplicating Indexes

Just today, I was reviewing the indexing of a database used by a third-party application. As part of my review, I ran a script against it that is used to identify duplicate indexes. The script I used was written Paul Nielsen, the author of the SQL Server Bible; and Itzik Ben-Gan. The script is available here.

After running Paul’s script against the database to find duplicate indexes, I discovered that almost every table had a duplicate index. As I delved into this a little closer, I discovered what the designer of this database had done.

First, it looked like he created primary keys using non-clustered indexes on an identity column for most of the tables in his database. Second, he then added a clustered index on the same identity column for each of these same tables. By doing so, he ended up having two identical indexes on the same identity column, one clustered and one non-clustered. Instead, the designer should have created his primary keys using clustered indexes instead of using non-clustered indexes. This would give him the same end effect, but requiring only a single index, not two. Obviously, by having duplicate indexes on most of his tables, he was causing SQL Server a lot of additional work to maintain extra indexes that were not needed.

If you have never checked your databases for duplicate indexes before, you should do so, as you might find duplicate indexes that can be removed, helping to boost the performance of your existing databases.

 

Share this post :

Comments

Posted by Alex Rosa on 16 April 2009

Great advice.

Some years ago I faced a similar case during a consulting work.

Fortunately there were around 15 duplicated indexes, but unfortunately some duplicate indexes were kept, because the developer wrote some TSQL inside the application using Index Hints.

So, if we drop the indexes some parts of the application will break down.

In this case I just left a recommendation to remove the Hints and the duplicated indexes.

Posted by David Bird on 17 April 2009

Besides checking for duplicate indexes, I would also recommend checking for duplicate foreign keys.

Posted by Steve Jones on 17 April 2009

I've seen this before as well. This is where developers get a really bad name from indexing. That and creating a separate index for every single column.

Posted by Raju Lalvani on 20 April 2009

We should also take care that all the referenced columns in foreign keys have indexes, many a time these are missing.

We should also check if the leftmost columns of indexes are same

Posted by hari_akc on 20 April 2009

usefull informations

Posted by Kurt W. Zimmerman on 20 April 2009

It is amazing how sloppy developers get when it comes to this aspect of the database.  I have spent countless hours cleaning up messes from so-called "Senior" DBAs removing duplicate indexes & FKs.  In doing so I've seen performance boosts in just about all cases.  

I have to laugh novice developers think it "magic"... lol

Just one of the responsibilities of a "good" DBA... :)

Kurt

plain ordinary DBA from the school of hard knocks...

Tarrytown, NY

Posted by TimothyAWiseman on 20 April 2009

This is an excellent post with a good point, but I would be cautious with this statement:

Instead, the designer should have created his primary keys using clustered indexes instead of using non-clustered indexes. This would give him the same end effect, but requiring only a single index, not two.

Clustered and nonclustered indexes are processed differently and there are time when one is preferred over the other.  While for most cases the clustered index is more effecient, there are some instances (particularly aggregates and particularly count) where SQL Server will process a nonclustered index much faster.

Posted by JJ B on 20 April 2009

I appreciate you blogging about this as it never occurred to me that duplicate indexes could be a problem.  I didn't think I would have any duplicates in any of my databases, but I was curious to see if I missed something.  So, I ran Paul's first query on a database with almost 300 tables.  I found 24 duplicates!  Yikes.  What was going on?

I haven't done a lot of research yet, but I found one cause that I can now work to remedy.  I use ERwin to generate the schema for my databases.  I have it set up so that by default, ERwin generates indexes for foreign keys.  Also by default, the primary keys are created as clustered indexes.  Guess what happens when a table's primary key is also a foreign key.  Yep, both get created.

I looked further into ERwin and see that I can keep the defaults, but have control at an index level to choose not to generate certain indexes.  So, I can go back in and fine-tune my database so that when I re-create it in the future, it will create a more efficient database.  Also, I now know about something else I must think about when I generate new tables and databases.  

Thanks for sharing this issue and query to help fix it!!!  It is not always a "dumb" mistake.  Sometimes it is just a matter of needing to be made aware of how things work.  It's a matter of needing to be educated.  I appreciate your contribution to my education.

Posted by SQL Noob on 20 April 2009

one time a dev gave us a script for a new table. we asked where we should put indexes. the reply we got was one index containing every column in the table

Posted by jim Etheridge on 20 April 2009

I created a utility script that writes missing index creation scripts, as identified by the system dynamic maintenance views (DMV) sys.dm_db_missing_index_details, sys.dm_db_missing_index_groups, and sys.dm_db_missing_index_group_stats.

The problem was that I discovered the DMVs often say an index is "missing" when it already exists. I don't know why. So I then wrote another utility to find the duplicates. This has saved me on several occasions.

I still have a handful of duplicates, as required by the vendor that created the database, because of the clustered/nonclustered issue from above, and specific references to index names in the application code.

But I now know where they are.

Posted by rlobbe on 21 April 2009

You can actually get a performance GAIN by having your PrimaryKey as a non-clustered index, and duplicating it as a clustered index. It comes down the the number of 'index entries' will fit in each page. If a query only accesses columns contained in the key it is FAR MORE efficient to access a non-clustered index than a clustered one (which also contains the rest of the tables data)

In the case of a Identity PK - it can make a great difference if other tables have ForeignKeys referencing it.

Not so dumb after all...

Your table usage and growth profiles should also be taken into account to see if you do gain from this.

Posted by Brad M. McGehee on 21 April 2009

In regard to TimothyAWiseman's and rlobbe's comments, there are always exceptions to every rule.

One exception I can think of (and thanks for Paul Nielson for pointing this out), is that a duplicate index might be helpful in some cases where a duplicate non-clustered index is much narrower than the clustered index (using the same key). In this case, it might be more efficient for SQL Server to use the narrower index than the wide clustered index. In a sense, the narrow non-clustered index is acting as a covering index for the wide clustered index. Of course, don't assume that this will always be the case. You will want to test this assumption before you implement it. And don't forget that maintaining two duplicate indexes increases index maintenance costs. You must maintain a balance between having to few and too many indexes. If a table is subject to a lot of data modification, but the queries that benefit from using a duplicate index run seldomly, then the additional cost of the duplicate index might not be worth the higher maintenance costs.

In the particular database I was analyzing, this was not the case though.

Below are some additional articles on this topic:

sqlblog.com/.../clustered-or-nonclustered-why-not-both.aspx

windowsitpro.com/.../in-search-of-duplicate-indexes-on-your-tables.html

blogs.msdn.com/.../detecting-overlapping-indexes-in-sql-server-2005.aspx

connect.microsoft.com/.../ViewFeedback.aspx

blogs.sqlserver.org.au/.../1233.aspx

www.mssqltips.com/tip.asp

sqlserverinternals.blogspot.com/.../finding-duplicate-indexes_23.html

Posted by Charles Kincaid on 22 April 2009

Read what Phil Factor and others have said about clustered indexes.  In short SQL Server builds a linked list of data pages for your table.  On top of that is the index to the pages.  The boost you see is from the server getting all of the data from the index and never having to hit the actual rows.

It's a pretty narrow hole to drive your camel through.  In GENERAL having a duplicate index is not a great thing.  In some specific instances doing what may seem like the wrong thing might turn out good.

Posted by c.masca on 3 May 2009

Another case where a duplicate clustered index is necessary is when there is need for are several uniquely-indexed views based on the same data, such as select a,b from view_facility1 and select a,b from view_facility2.

The views are used a pseudo-tables in an MS access front-end that needs to hide facility2 data from facility1 users.

Leave a Comment

Please register or log in to leave a comment.