Wrong Index Being Used

Question

Wrong Index Being Used

Roger Sabin

Ten Centuries

Points: 1022
More actions
February 7, 2013 at 4:27 pm

#273917

I was wondering if anyone has seen this before and has any idea why it is happening. This is occuring in our production. (All column, table and index names have been changed to protect the guilty.) Problem involves a single select statement on a single table.
First the background information:
We are using SQL Server 2005.
Table has 40,257,935 rows.
Table has 390 columns.
Table has several indexes of which only 2 are pertinent to this discussion.
The table stores a monthy snapshot of account information. The two indexes are as follows:
Name: IDX_MonthlyAccount_YearNum_MonthNum_AltKey_AccountNumber
Indexed Columns: YearNum, MonthNum, AltKey and AccountNumber
Name: IDX_MonthlyAccount_YearNum_MonthNum_AccountNumber
Indexed Columns: YearNum, MonthNum, AccountNumber
If we use the following select statement, the second index (IDX_MonthlyAccount_YearNum_MonthNum_AccountNumber) is correctly used:
select YearNum, MonthNum, AccountNumber from MonthlyAccount where YearNum = 2013 and MonthNum = 1 and AccountNumber = 'ABCDEF'
But, if we use the following select statement, the first index (IDX_MonthlyAccount_YearNum_MonthNum_AltKey_AccountNumber) is incorrectly used:
select YearNum, MonthNum, AccountNumber, Name from MonthlyAccount where YearNum = 2013 and MonthNum = 1 and AccountNumber = 'ABCDEF'
The only change between the two selects is the addition of a column (Name) in the select list that is not part of the second index.
Why is the wrong index being used in the second select?

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 1

Not enough information. If you could post the actual execution plan (saved as a .sqlplan file) for both queries we could probably give you a better answer.

As it is, we no little about the table or the actual definition of the indexes, how much data is being returned, etc.

Roger Sabin Ten Centuries Points: 1022 More actions · Answer 2

Execution plans from our production for both types of select statements

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 3

Your "wrong index" is the clustered index. Since SQL Server needed to go to the table for the additional information not included in the nonclustered index (the "correct index"), SQL Server decided it was cheaper to do an INDEX SEEK on the clustered index rather than an INDEX SEEK on the nonclustered index with a bookmark lookup to the data for the additional information.

Looks to me as if SQL Server used the "correct index" where you believe it used the "wrong index" for the second query.

Brandie Tarvin SSC Guru Points: 172994 More actions · Answer 4

Roger Sabin (2/7/2013)
The two indexes are as follows:
Name: IDX_MonthlyAccount_YearNum_MonthNum_AltKey_AccountNumber
Indexed Columns: YearNum, MonthNum, AltKey and AccountNumber
Name: IDX_MonthlyAccount_YearNum_MonthNum_AccountNumber
Indexed Columns: YearNum, MonthNum, AccountNumber
If we use the following select statement, the second index (IDX_MonthlyAccount_YearNum_MonthNum_AccountNumber) is correctly used:
select YearNum, MonthNum, AccountNumber from MonthlyAccount where YearNum = 2013 and MonthNum = 1 and AccountNumber = 'ABCDEF'
But, if we use the following select statement, the first index (IDX_MonthlyAccount_YearNum_MonthNum_AltKey_AccountNumber) is incorrectly used:
select YearNum, MonthNum, AccountNumber, Name from MonthlyAccount where YearNum = 2013 and MonthNum = 1 and AccountNumber = 'ABCDEF'

Aside from what Lynn is saying, I recently learned that indexes are also chosen based on column order. I do not know why the first index (that has AltKey) is being chosen for the first query, but SQL Server is correctly choosing the second index on the second query because the order of your SELECT follows the order of columns defined in the second index. That's part of how it knows which indexes it can use.

Food for thought. EDIT: Lynn, feel free to correct me if I'm wrong. I'm still figuring out the innards of indexing.

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

mtassin SSC-Insane Points: 23099 More actions · Answer 5

Lynn's spot on with this.

Because the Name field is missing from the non-clustered index it would require a seek on the non-clustered followd by a bookmark lookup to the clustered index for the fields missing from the index. That's more expensive than using the Clustered index once to perform the entire query.

If you want to use the non-clustered index, INCLUDE the name field in the index.

i.e.

DROP INDEX UN_MonthlyCMS.[IDX_MonthlyCMS_ProcessYear_ProcessMonth_VisionAccountNumber_CreditPlanNumber]

CREATE INDEX [IDX_MonthlyCMS_ProcessYear_ProcessMonth_VisionAccountNumber_CreditPlanNumber] ON UN_MonthlyCMS(ProcessYear, ProcessMonth, VisionAccountNumber) INCLUDE (CustomerName)

--Mark Tassin
MCITP - SQL Server DBA
Proud member of the Anti-RBAR alliance.
For help with Performance click this link[/url]
For tips on how to post your problems[/url]

Roger Sabin Ten Centuries Points: 1022 More actions · Answer 6

I understand what's been said.

But I still don't see why the first index is more "correct" when it requires trolling through thousands of records to find just the few for an account when the second index goes directly to the few records for the account.

Doesn't the equality comparison on the AccountNumber column make the second index a better choice than the first index?

Brandie Tarvin SSC Guru Points: 172994 More actions · Answer 7

It might help us a little to see the queries in question.

I'm looking at the pictures generated by the execution plans and I see a difference in the predicate searches that might point to why the clustered index is being chosen over the non-clustered index. The indexes themselves seem to have the same relative cost, but I'm still learning how to interpret the data on these, so I could be wrong.

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 8

Roger Sabin (2/8/2013)
I understand what's been said.
But I still don't see why the first index is more "correct" when it requires trolling through thousands of records to find just the few for an account when the second index goes directly to the few records for the account.
Doesn't the equality comparison on the AccountNumber column make the second index a better choice than the first index?

You are doing an equality comparision on AccountNumber on both indexes. SQL Server is doing a seek on both indexes. For your second query, if SQL Server used the nonclustered index, it would then have to do a bookmark lookup to the table (in this case the clustered index) to get the rest of the data to satisfy the query. Instead it chose to do an index seek on the clustered index as this satisfies the query.

Brandie Tarvin SSC Guru Points: 172994 More actions · Answer 9

Lynn Pettis (2/8/2013)
Roger Sabin (2/8/2013)
I understand what's been said.
But I still don't see why the first index is more "correct" when it requires trolling through thousands of records to find just the few for an account when the second index goes directly to the few records for the account.
Doesn't the equality comparison on the AccountNumber column make the second index a better choice than the first index?
You are doing an equality comparision on AccountNumber on both indexes. SQL Server is doing a seek on both indexes. For your second query, if SQL Server used the nonclustered index, it would then have to do a bookmark lookup to the table (in this case the clustered index) to get the rest of the data to satisfy the query. Instead it chose to do an index seek on the clustered index as this satisfies the query.

Lynn, for my own edification... This is because Name isn't included in either index, right? So the Clustered Index is easier to use to find name as associated with the rest of the data that actually is in the index?

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 10

Brandie Tarvin (2/8/2013)
Lynn Pettis (2/8/2013)
Roger Sabin (2/8/2013)
I understand what's been said.
But I still don't see why the first index is more "correct" when it requires trolling through thousands of records to find just the few for an account when the second index goes directly to the few records for the account.
Doesn't the equality comparison on the AccountNumber column make the second index a better choice than the first index?
You are doing an equality comparision on AccountNumber on both indexes. SQL Server is doing a seek on both indexes. For your second query, if SQL Server used the nonclustered index, it would then have to do a bookmark lookup to the table (in this case the clustered index) to get the rest of the data to satisfy the query. Instead it chose to do an index seek on the clustered index as this satisfies the query.
Lynn, for my own edification... This is because Name isn't included in either index, right? So the Clustered Index is easier to use to find name as associated with the rest of the data that actually is in the index?

Think of it, in this case, as a covering index since all columns not part of the index are also included by definition. Yes, if Name had been a part of the nonclustered index, perhaps as suggested an included column, then SQL Server would have used that index instead. It is much smalleer than the clustered index (fewer columns, more "data" per page, etc).

mtassin SSC-Insane Points: 23099 More actions · Answer 11

Brandie Tarvin (2/8/2013)
Lynn, for my own edification... This is because Name isn't included in either index, right? So the Clustered Index is easier to use to find name as associated with the rest of the data that actually is in the index?

All fields are always part of the clustered index. If you don't specify a field as part of the clustering key, it's still included in the clustered index because the clustered index is the table itself.

--Mark Tassin
MCITP - SQL Server DBA
Proud member of the Anti-RBAR alliance.
For help with Performance click this link[/url]
For tips on how to post your problems[/url]

Roger Sabin Ten Centuries Points: 1022 More actions · Answer 12

So, you are saying that SQL Server determined --

the cost of finding a few records (using second index) + lookup for other fields

was greater than ---

the cost of searching through thousands of records (using first index) to find the few matching that where clause but without a need to do a lookup for the additional fields needed

and that is why the first index was chosen.

I always thought that the index with the greatest number of equality matches (in the order of the columns in the index without any gaps) was always selected. Only when there were range matches or two or more indexes could satisfy the equality matches would it take into account things like lookups.

Brandie Tarvin SSC Guru Points: 172994 More actions · Answer 13

Roger,

If I understand things correctly, it isn't about equality of the Account Number match. It's about you asking the engine for more information than the nonclustered index could easily supply. The engine had a choice, do 2 searches with the nonclustered index or do 1 search with the clustered index.

Since the clustered index is physically sorted by the order of the index-specified columns, the engine knew where to find all relevant information tied to the account number. It didn't have to grab a pointer and then search all the relevant levels of the index for the additional information.

You are correct that the nonclustered index is faster when no other information (save the columns listed in the index) has been requested.

Brandie Tarvin, MCITP Database AdministratorLiveJournal Blog: http://brandietarvin.livejournal.com/[/url]On LinkedIn!, Google+, and Twitter.Freelance Writer: ShadowrunLatchkeys: Nevermore, Latchkeys: The Bootleg War, and Latchkeys: Roscoes in the Night are now available on Nook and Kindle.

mtassin SSC-Insane Points: 23099 More actions · Answer 14

Roger Sabin (2/8/2013)
So, you are saying that SQL Server determined --
the cost of finding a few records (using second index) + lookup for other fields
was greater than ---
the cost of searching through thousands of records (using first index) to find the few matching that where clause but without a need to do a lookup for the additional fields needed
and that is why the first index was chosen.
I always thought that the index with the greatest number of equality matches (in the order of the columns in the index without any gaps) was always selected. Only when there were range matches or two or more indexes could satisfy the equality matches would it take into account things like lookups.

It didn't search through thousands of records, the clustered index was selective enough that it performed a Clustered Index Seek, not a Clustered Index Scan and found the records and didn't then have to perform a second Clustered index seek and then combine the results in order to get the full data. If you want it to use the non-clustered index, make the non-clustered index include the columns, otherwise the clustered index seek operation is more efficient.

--Mark Tassin
MCITP - SQL Server DBA
Proud member of the Anti-RBAR alliance.
For help with Performance click this link[/url]
For tips on how to post your problems[/url]