The purpose of this article is to give you a working knowledge of how to view and understand query execution plans for SQL Server. This is part 1 in a series of articles that will walk you through understanding execution plans to help you improve your queries. Throughout these articles I will use SQL hints to force SQL Server query optimizer to choose the data retrieval methods being explained; the use of hints without careful planning and testing will degrade performance. I will not mention much about hints nor explain their use as this is outside the scope of this article. I plan to explain how to analyze execution plans to help you optimize your queries.
To view the graphical representation for data retrieval methods the optimizer can choose from please go to BOL (Books On Line) and search for execution plan and double click it. For SQL Server 7.0 choose the item titled Graphically Displaying the Execution Plan Using SQL Server Query Analyzer. For SQL Server 2000 choose Execution Plan Options. I recommend you have this available as you read this article. It will give you additional information about the topics I cover in this article.
All sample code used in this article will work when executed in the pubs database. To view execution plans in Query Analyzer you must click on Query and select Show Execution Plan. Now for any query you execute in this window you will get an additional tab in the lower part of Query Analyzer. It will be labeled execution plan. This is where you will find the execution plan for each query you executed in the upper window of Query Analyzer. For convenience I will include diagrams that will show you exactly what I saw while writing this article. Feel free to use them and/or view them in your own Query Analyzer by executing the code given in the examples.
Execute this query in Query Analyzer:
USE pubs SELECT * FROM titles
Now take a look at the execution plan in diagram 1. Above each graphical representation of the execution plan you will find what I’ll call the query header. In the query header you will always find the cost of that query relative to all other queries executed in the batch. This cost will be represented as a percentage. In all the examples in this article one batch will be all the queries executed in a Query Analyzer window. The query header will also show you the query that was executed to generate the execution plan below it.
The graphic portion of the execution plan is always read from right to left and from top to bottom. Thus the beginning of the execution plan is the upper right most icon. The number of branches leading into the top line of the execution plan increase based on the number of tables you include in your query. All plans end with one icon, which is the end result of the query. This will always be found in the upper left corner, just under the query header.
You get an idea of what the query optimizer did to get the result set just by looking at the execution plan. The plan generated for the above query should give you two icons. The one on the right will be the one that fetches all the rows from the table titles. On my screen I see the icon for an index scan. Under this icon I see the table name and the first few characters for the name of the index used. Under this I see how much cost this step took compared with all the other steps in the query. It should show 100% because the above query is so simple and has no conditions that would require a filter step. The left step has an icon that indicates it displayed results in the result window. Under it you can see the word SELECT and that it cost 0% of the query to display the results.
To see exactly how SQL Server got the results it displayed you must move the mouse pointer across each icon and or line. Here is where you will get detailed information about what the optimizer did to give you the result set it displayed. The usefulness of such details increases with the complexity of the query. In diagram 2 you will find the contents of the yellow boxes described below. In this, and future diagrams, I will show all the yellow boxes to be discussed and connected them with a line to the icon or line where you will find them in your Query Analyzer. In your Query Analyzer you will only see one yellow box at a time.
If you just want to know how many rows and the estimated size per row as the query moves from step to step then just move your mouse pointer across the lines; you’ll then see a small yellow box with this info in it. Also, the lines are thicker when more rows are passed from one step to the next. So a thin line indicates 0 or few rows being passed. The thicker the line the more rows that are being passed to the next step and the slower the query will run. So one goal when developing a query is to get the number of rows it handles down to as few as possible as early as possible, preferably within the first or second step of the query. On my screen I see that 18 rows with an estimated row size (total length per row) of 207 was passed from the first step to the second step.
To get more details on what happened and why so many or so few rows were pulled in a particular step you must move the mouse pointer on top of one of the icons.
You will get a lot of information in a large yellow box. This gives you details as to what the query optimizer did to determine how many rows continue to the next step.
Inside The Large Yellow Box
You can refer to the yellow box in diagram 3 while I explain the information found inside the large yellow boxes. At the top of each yellow box you will see the name of the step/operation. When the operation has a different physical and logical operator you will see the physical operator first followed by a slash (/) and then the logical operator. Directly below this name you will see a brief description of the operation. Below that description you will see a list that gives you more information about what was performed in this step.
- Physical operation: This is the operator used by the query optimizer for this step. One example would be a nested loop.
- Logical operation: Is usually the same as the physical operator. When different, this represents a keyword in the query. One example would be INNER JOIN.
- Row count: This is the number of rows fetched by this step and passed to the next step.
- Estimated row size: This is the estimated row size passed on to the next step.
- I/O cost: This is the estimated cost for all I/O activity for a step. The lower this number is the better.
- CPU cost: This is the estimated cost for all CPU activity.
- Number of executes: Is the number of times the step was executed.
- Cost: Is the cost for the step being viewed. The lower this number is the better.
- Subtree cost: Is the cost of the step being viewed and all previous steps.
- Arguments: Here you will see portions of the WHERE clause, parameters, and names of objects (such as tables and indexes) used in the step or operation.
Now that you have a basic understanding of how to read execution plans we will look at a few execution plans. Please note that there is no guarantee you will get the same execution plan as I am describing as the plan is influenced by many factors including, hardware, number of processors, current SQL Server load, number of rows in tables, indexes, etc. I have kept the examples as simple as possible and used hints in an attempt to force the execution plan to be the same as what I describe in my examples. I am providing diagrams that will allow you to see exactly what I saw when I wrote this article.
Example 1: Nested Loop Join
Execute the following query in Query Analyzer:
USE pubs SELECT * FROM titleauthor ta INNER LOOP JOIN titles t ON ta.title_id = t.title_id
Now look at the execution plan in your Query Analyzer or see diagram 4. For convenience I have placed all four yellow boxes in the diagram so they can all be viewed at once.
Start with the upper right icon labeled titleauthor.UPK. You will notice that SQL Server used a clustered index to scan the table. This is similar to a table scan, since the index is the table. You can find the name of the index it used in the arguments section, in this case [UPKCL_taind]. If you look at the table titleauthors (in Enterprise Manager) you can see it has 25 rows so this step pulled all the rows in that table. Looking at the number of executes you can see that it did this with one scan of the table. If you look at the cost section you will notice that this scan cost 88% of the total cost to execute this query.
Next lets look at the lower left icon labeled titles.UPKCL_ti. For this table SQL Server used a clustered index seek to seek specified rows from that table. Again, you can find the name of the index in the arguments section of the yellow box; it is called [UPKCL_titleidind]. Examining the number of executes will show you that SQL Server ran this step 25 times, once for each record it pulled from the table titleauthors. This step took 12% of the total cost to execute this query.
Now lets look at the center icon labeled Nested Loops. You will notice that there is almost no cost associated with this step. This is due to the fact that this step merely told SQL Server how to join the rows and indicated how SQL Server should scan for the rows it needed. The work was done in the previous two scans and then simply merged together in this step. Here is where I will explain the subtree cost. You will notice that for the Nested Loop step the subtree cost is .0430. This is the sum of the cost section in the previous two steps, which are .037685 and .005272. The subtree cost will show you the cost of the current step and all previous steps. In this step you will notice that the physical operation and logical operation are different. This is because the join type is an INNER JOIN (the logical operation) and the join method used is a nested loop. If you were to change the keyword INNER to LEFT in this example and view that execution plan you would see that the physical operation is still a nested loop, however the logical operation has been changed to reflect the use of the keyword LEFT instead of INNER.
The last icon in this execution plan is labeled SELECT. This step cost even less than the previous step. This is because all the data is ready for display and all that is required to complete the query execution is output the results to the bottom screen of Query Analyzer.
Example 2: Hash Match Join
Now we are going to change one word in the query executed in example 1. We will replace LOOP with HASH. The new query will look like this:
USE pubs SELECT * FROM titleauthor ta INNER HASH JOIN titles t ON ta.title_id = t.title_id
You can execute this query in your Query Analyzer or simply refer to diagram 5. We will not go into each icon in as much detail as before since two of them are nearly identical to the ones we examined in example 1.
For the upper right icon you will notice that it is nearly identical to the upper right icon in example 1 except for the cost percentage compared with the other 3 steps in the execution plan. The percentage has dropped to 40% even though the cost is the same. This is because other steps cost more this time due to the change in the query.
For the lower right icon you will notice that it only executes once where as in example 1 it executed 25 times. This time it is a clustered index scan and not a clustered index seek. This is because we changed the physical operation from a nested loop to a hash match loop. You’ll also notice that this step only pulled 18 rows where as in the previous example 25 rows were pulled. Again, specifying the hash match is what has caused the change. This hash match has grabbed the rows it needs from both tables.
Now in the middle icon you will notice that it creates what is called a hash table (comparable with a temp table) and fills it with the results obtained in the upper right step. After doing this, it matches each row in the hash table with a row from the second table to get its result set. This costs much more than the nested loop join used in example 1. This makes it clearer that there are multiple authors for the same title and thus even though only 18 rows were pulled from table titles they matched 25 records in the table titleauthors.
The last step simply displays the result and so has almost no cost associated with it.
In this article I have shown how you can see and read execution plans. Reading these can give you a better understanding of how SQL Server executes the queries you design and how it makes use of indexes. We have examined two simple execution plans to learn how to read them and understand them better. Examining execution plans will help you see areas where a query is doing things you don’t expect. You can then make changes and rerun the query to examine the new execution plan to determine if your changes have helped or not.