April 18, 2013 at 5:13 am
Hi all, I have some knowledge about indexes but it's very fuzzy when the queries become more complex, where upon everything I read on the internet seems to contradict itself. For example, if we have the following two tables and two queries run as reports:
People:
PeopleId
FirstName
LastName
Age
Address
Country
Profession
.... some other fields .....
Employee:
EmployeeId
FirstName
LastName
Company
Department
EmploymentStatus
... some other fields....
Query 1:
SELECT P.FirstName, P.LastName, P.Age, P.Address, E.Company, E.Department
FROM People P INNER JOIN Employees E ON (P.LastName = E.LastName AND P.FirstName = E.FirstName)
Where E.EmploymentStatus = 'Employed' AND P.Profession is not null
Query 2:
SELECT P.FirstName, P.LastName, P.Age, P.Address, E.Company, E.Department
FROM People P INNER JOIN Employees E ON (P.LastName = E.LastName)
Where E.EmploymentStatus = 'Employed'
By the way, there are no primary keys or clustered indexes in this query for simplicity sake.
For Query 1, I would write the following two non-clustered indexes for each table (please don't mind the syntax) in order to get an index seek:
CREATE NONCLUSTERED INDEX INDEX_Q1_Employee ON Employee(
LastName ASC,
FirstName ASC,
EmploymentStatus ASC)
INCLUDE (Company, Department)
CREATE NONCLUSTERED INDEX INDEX_Q1_People ON People(
LastName ASC,
FirstName ASC,
Profession ASC)
INCLUDE (Age, Address)
Question 1: Are these indexes correct? From my knowledge, whatever is in the JOIN or WHERE clauses should get placed inside the index and what remains in the SELECT statement should get "covered" and placed with the include portion.
Question 2: The two indexes above are written for Query 1. Can an index seek on both tables be used using these indexes for Query 2? Please note that unlike Query 1, Query 2 only joins on LastName and there is no "P.Profession is not null" in the WHERE clause. I'm curious if the two indexes created cover this query or if I need another set of indexes to get an index seek.
Any help would be greatly appreciated.
April 18, 2013 at 5:21 am
Neither of those indexes are at all useful for the first query, only the one on person is useful for the second.
Index column order matters. It matters a lot. SQL can't do index seeks if the column it needs to seek on is anything other than the leading column.
With the indexes you have there you will likely see two index scans and probably a merge join.
Take a read through these, if you still have questions after, please ask.
http://www.sqlservercentral.com/articles/Indexing/68439/
http://www.sqlservercentral.com/articles/Indexing/68563/
http://www.sqlservercentral.com/articles/Indexing/68636/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 18, 2013 at 5:49 am
@GilaMonster - Thanks for the reply and links. I actually read those links in the past and went through them again (specifically the column ordering part) but I am still confused. Why is my ordering wrong? What would you suggest would be a better index for the 2 queries?
April 18, 2013 at 6:03 am
The point of an index is to reduce the number of rows needed for the query as early as possible. To do that, the index needs to be seekable for the where clause predicates. Yours is not, because the columns used in the where clause are not a left-based subset of the index key. The left-most column in your index is one of the join columns. Hence there is no way to execute that query other than by scanning the index (can't seek based on the where clause) and then probably merge join.
Take a read through this, and note that it's talking about where clause predicates, not join predicates. The where clause is the primary thing you will be looking at for indexing, joins secondary.
http://sqlinthewild.co.za/index.php/2009/01/19/index-columns-selectivity-and-equality-predicates/
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 18, 2013 at 6:20 am
Ok, I think there is where I had the misconception; I did not how how to create an effective index for a query with both a JOIN and WHERE statement, mainly because the examples I've looked were pretty simple. Assuming, WHERE clauses should take precedence over JOINs, would these indexes be better for causing an index seek for Query 1:
CREATE NONCLUSTERED INDEX INDEX_Q1_Employee ON Employee(
EmploymentStatus ASC,
LastName ASC,
FirstName ASC)
INCLUDE (Company, Department)
CREATE NONCLUSTERED INDEX INDEX_Q1_People ON People(
Profession ASC,
LastName ASC,
FirstName ASC)
INCLUDE (Age, Address)
Could they also be applied for Query 2?
Also, the reason why I added the JOIN columns where the WHERE ones was because I thought FROM/JOIN comes before WHERE in the SQL order of operations:
FROM & JOINs
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
April 18, 2013 at 6:34 am
The Comedian (4/18/2013)
CREATE NONCLUSTERED INDEX INDEX_Q1_Employee ON Employee(EmploymentStatus ASC,
LastName ASC,
FirstName ASC)
INCLUDE (Company, Department)
CREATE NONCLUSTERED INDEX INDEX_Q1_People ON People(
Profession ASC,
LastName ASC,
FirstName ASC)
INCLUDE (Age, Address)
Yes, those are much better and they'll work for both queries, though you may want a second index on People just on LastName. Up to you, check the query plans and query execution characteristics and see if a second index is called for.
Also, the reason why I added the JOIN columns where the WHERE ones was because I thought FROM/JOIN comes before WHERE in the SQL order of operations:
That's the logical query processing order, not necessarily how SQL actually runs the queries. If it can, SQL will execute from and where before join. It may in some cases execute the group by before the join, after in other cases depending on what's actually in the group by and so on. Order by can be executed as part of the index read or at the very end of the query depending on indexes and what's in the order by clause.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
April 18, 2013 at 6:42 am
Ok got it. Thanks a lot for your help. 😀
Viewing 7 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply