Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


how to find duplicate indexes in all the tables in a database


how to find duplicate indexes in all the tables in a database

Author
Message
achtro
achtro
Grasshopper
Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)

Group: General Forum Members
Points: 16 Visits: 480
how to find duplicate indexes in a table ,example a nonclustered index names FX_LOCATION_ID is created on columns a,b and again there is a nonclustered index named FX_LOCATION_ID1 on the same columns a,b in a table.Now i want to find the duplicate indexes on all the tables in a database,does anybody has any sql script in handy that i can use.
dwain.c
dwain.c
SSCarpal Tunnel
SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)

Group: General Forum Members
Points: 4275 Visits: 6431
I think that there are probably faster and better ways, but I think this is one way:


;WITH IndexColumns AS (
SELECT DISTINCT a.object_id, a.name, a.type_desc, b.column_id, TABLE_NAME=c.name, [COL NAME]=d.name
FROM sys.indexes a
INNER JOIN sys.index_columns b ON a.object_id = b.object_id AND a.index_id = b.index_id
INNER JOIN sys.tables c ON b.object_id = c.object_id
INNER JOIN sys.columns d ON c.object_id = d.object_id AND b.column_id = d.column_id
WHERE is_hypothetical = 0
),
CombineCols AS (
SELECT object_id, name, type_desc, table_name
,columns=STUFF((
SELECT ',' + [COL NAME]
FROM IndexColumns b
WHERE a.object_id = b.object_id AND
a.name = b.name AND
a.type_desc = b.type_desc AND
a.TABLE_NAME = b.TABLE_NAME
ORDER BY [COL NAME]
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 1, '')
FROM IndexColumns a
GROUP BY object_id, name, type_desc, table_name)
SELECT b.type_desc, a.table_name, a.columns, b.name
FROM (
SELECT table_name, columns
FROM CombineCols
GROUP BY table_name, columns
HAVING COUNT(name) > 1) a
INNER JOIN CombineCols b
ON a.table_name = b.table_name AND
a.columns = b.columns
ORDER BY a.table_name, a.columns




It is not particularly fast but give it a chance and see what it gets you.

Edit: Improved the query speed, added remarks below and removed "hypothetical" indexes.

Remarks:
1) If a CLUSTERED index exists on the same fields as a non-CLUSTERED index, both are listed.
2) If two (or more) indexes exist involving the same fields but where the fields are ordered differently, these are also listed.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
achtro
achtro
Grasshopper
Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)Grasshopper (16 reputation)

Group: General Forum Members
Points: 16 Visits: 480
thanks dwain that was very helpful and the query you provided is perfect for my scenario that i mentioned.
PiMané
PiMané
Mr or Mrs. 500
Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)

Group: General Forum Members
Points: 592 Visits: 1334
Hi,

The query doesn't differentiate a "normal" index column from an "include" index column...
In following statements idx_dupIndexes_03 and idx_dupIndexes_06 are considered duplicate but they're not.

CREATE TABLE dupIndexes (id1 INT, id2 INT, id3 INT, id4 INT)
CREATE INDEX idx_dupIndexes_01 ON dupIndexes (id1, id2)
CREATE INDEX idx_dupIndexes_02 ON dupIndexes (id2, id1)
CREATE INDEX idx_dupIndexes_03 ON dupIndexes (id2, id1) INCLUDE (id3)
CREATE INDEX idx_dupIndexes_04 ON dupIndexes (id2, id1) INCLUDE (id3, id4)
CREATE INDEX idx_dupIndexes_05 ON dupIndexes (id2, id1) INCLUDE (id4, id3)
CREATE INDEX idx_dupIndexes_06 ON dupIndexes (id2, id1, id3)



One more question, and in this one I might be saying nonsense...
In the following statements:

CREATE TABLE dupIndexes2 (id1 INT, id2 INT, id3 INT, id4 INT)
CREATE INDEX idx_dupIndexes2_01 ON dupIndexes2 (id1, id2)
CREATE INDEX idx_dupIndexes2_02 ON dupIndexes2 (id1, id2, id3)


should both indexes be considered equal since 02 has all the columns 01 has?
When I run "SELECT id1 FROM dupIndexes2 WHERE id1 = 1 AND id2 = 2" the sql uses idx_dupIndexes2_02 (according to the execution plan).

Thks,
Pedro



If you need to work better, try working less...
dwain.c
dwain.c
SSCarpal Tunnel
SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)

Group: General Forum Members
Points: 4275 Visits: 6431
PiMané (9/17/2012)
Hi,

The query doesn't differentiate a "normal" index column from an "include" index column...
In following statements idx_dupIndexes_03 and idx_dupIndexes_06 are considered duplicate but they're not.

CREATE TABLE dupIndexes (id1 INT, id2 INT, id3 INT, id4 INT)
CREATE INDEX idx_dupIndexes_01 ON dupIndexes (id1, id2)
CREATE INDEX idx_dupIndexes_02 ON dupIndexes (id2, id1)
CREATE INDEX idx_dupIndexes_03 ON dupIndexes (id2, id1) INCLUDE (id3)
CREATE INDEX idx_dupIndexes_04 ON dupIndexes (id2, id1) INCLUDE (id3, id4)
CREATE INDEX idx_dupIndexes_05 ON dupIndexes (id2, id1) INCLUDE (id4, id3)
CREATE INDEX idx_dupIndexes_06 ON dupIndexes (id2, id1, id3)



One more question, and in this one I might be saying nonsense...
In the following statements:

CREATE TABLE dupIndexes2 (id1 INT, id2 INT, id3 INT, id4 INT)
CREATE INDEX idx_dupIndexes2_01 ON dupIndexes2 (id1, id2)
CREATE INDEX idx_dupIndexes2_02 ON dupIndexes2 (id1, id2, id3)


should both indexes be considered equal since 02 has all the columns 01 has?
When I run "SELECT id1 FROM dupIndexes2 WHERE id1 = 1 AND id2 = 2" the sql uses idx_dupIndexes2_02 (according to the execution plan).

Thks,
Pedro


Perhaps I should have noted that the query I provided was just something I threw together.

The two points you mention are probably valid for consideration. I wouldn't want to use the results of the query to simply DROP indexes that the script says are duplicates. A little analysis should be performed on the results to decide what is appropriate to DROP and what not.

Having said that though, the INCLUDEs can probably be resolved. An interesting point that I may look into, as I happen to need this for something else anyway.

I believe in the second case, the two indexes would not be considered duplicates (by my script). I believe I would call that a case of "overlapping indexes" rather than identical ones. A case that could probably be handled if needed.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
dwain.c
dwain.c
SSCarpal Tunnel
SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)

Group: General Forum Members
Points: 4275 Visits: 6431
achtro (9/17/2012)
thanks dwain that was very helpful and the query you provided is perfect for my scenario that i mentioned.


You are most welcome. As I said above, it turns out I needed something like this anyway. So the effort has served both of our purposes. :-)


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
PiMané
PiMané
Mr or Mrs. 500
Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)Mr or Mrs. 500 (592 reputation)

Group: General Forum Members
Points: 592 Visits: 1334
dwain.c (9/17/2012)
Perhaps I should have noted that the query I provided was just something I threw together.

The two points you mention are probably valid for consideration. I wouldn't want to use the results of the query to simply DROP indexes that the script says are duplicates. A little analysis should be performed on the results to decide what is appropriate to DROP and what not.

Having said that though, the INCLUDEs can probably be resolved. An interesting point that I may look into, as I happen to need this for something else anyway.


Hi,
By no means I meant to question your query.. just wanted to alert to the fact that the include columns aren't considered as include but as normal columns.
I took the liberty to change the query so it does that.

WITH IndexColumns AS (
SELECT DISTINCT a.object_id, a.name, a.type_desc, b.column_id, TABLE_NAME=c.name, [COL NAME]=d.name, b.is_included_column
FROM sys.indexes a
INNER JOIN sys.index_columns b ON a.object_id = b.object_id AND a.index_id = b.index_id
INNER JOIN sys.tables c ON b.object_id = c.object_id
INNER JOIN sys.columns d ON c.object_id = d.object_id AND b.column_id = d.column_id
WHERE is_hypothetical = 0
),
CombineCols AS (
SELECT object_id, name, type_desc, table_name
,columns=STUFF((
SELECT ',' + [COL NAME]
FROM IndexColumns b
WHERE a.object_id = b.object_id AND
a.name = b.name AND
a.type_desc = b.type_desc AND
a.TABLE_NAME = b.TABLE_NAME AND b.is_included_column = 0
ORDER BY [COL NAME]
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 1, '')
,include_columns=ISNULL(STUFF((
SELECT ',' + [COL NAME]
FROM IndexColumns b
WHERE a.object_id = b.object_id AND
a.name = b.name AND
a.type_desc = b.type_desc AND
a.TABLE_NAME = b.TABLE_NAME AND b.is_included_column = 1
ORDER BY [COL NAME]
FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'), 1, 1, ''), '')
FROM IndexColumns a
GROUP BY object_id, name, type_desc, table_name)
SELECT b.type_desc, a.table_name, a.columns, a.include_columns, b.name
FROM (
SELECT table_name, columns, include_columns
FROM CombineCols
GROUP BY table_name, columns, include_columns
HAVING COUNT(name) > 1) a
INNER JOIN CombineCols b
ON a.table_name = b.table_name AND
a.columns = b.columns AND
a.include_columns = b.include_columns
ORDER BY a.table_name, a.columns




dwain.c (9/17/2012)

I believe in the second case, the two indexes would not be considered duplicates (by my script). I believe I would call that a case of "overlapping indexes" rather than identical ones. A case that could probably be handled if needed.


They are not considered duplicates, my question is completely out of this "subject"... A table with an index with (col1, col2) and another index with (col1, col2, col3) according to the execution plan always uses the 2nd index... Shouldn't "sub-indexes" be considered also duplicates?!

Thanks,
Pedro



If you need to work better, try working less...
dwain.c
dwain.c
SSCarpal Tunnel
SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)SSCarpal Tunnel (4.3K reputation)

Group: General Forum Members
Points: 4275 Visits: 6431
Nice work on the query mod. Like I said, I figured that info must be somewhere in those tables I SELECTed against.

On your second point, I guess it depends on what your requirements are. If the query plan always chooses one over the other, it would make sense to at least mark these some way so they can be identified and analyzed later. I can't confirm whether that is always the case.


My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
My temporal SQL musings: Calendar Tables, an Easter SQL, Time Slots and Self-maintaining, Contiguous Effective Dates in Temporal Tables
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search