http://www.sqlservercentral.com/blogs/stratesql/2011/02/11/can-you-dig-it_3F00_-_1320_-find-estimated-cost/

Printed 2014/10/25 10:37PM

Can You Dig It? – Find Estimated Cost

By StrateSQL, 2011/02/11

3295969599_eb16a58118_m1
She can dig it!
D Sharon Pruitt

It’s the eighth post in the plan cache series.  I hadn’t expect to take as many days off from this due to the 31 Days of SSIS, but that one had a stricter time table that I was following.  Though now that I’ve wrapped up that series I want to get back to talking about things you can pull out of, and discover in, the plan cache.

Today we’re going to do a little more digging into the StmtSimple element.  As I mentioned in a previous post, there are some interesting attributes that could provide some insight in this element.  Today, let’s look at StatementSubTreeCost; which is similar in many respected to StatementEstRows.

StatementSubTreeCost Attribute

As I mentioned, we will be looking at the attribute StatementSubTreeCost.  This attribute returns the estimated cost that a calculated when the plan was compiled.  The cost is based on the statistics and distributions of data that the query will encounter when executing.   The higher the cost, the more expensive the query will be against the resources of your environment.

There aren’t any hard guidelines, that I know of, that indicate specifically when the cost of a query is too high or what the average cost of your queries should be.  Costs are based on the statistics for your data and derived from the manner in which an execution plan is put together.

With the consulting work that I do, I often don’t blink when the cost estimation is below one for queries that are often executed.  If the query runs often, though, and the cost is in the thousands or millions, I’ll wonder if the query is really doing what it should or if it’s making the request properly.

StatementSubTreeCost Query

Below is a query that I’ve used a few times for investigating StatementSubTreeCost information from plan caches.  You’ll notice that it’s structure is a bit different from the one provided for querying for StatementEstRows.  The reason for this difference is that I found on one SQL Server the query ran long.  The version below pulls by StatementSubTreeCostand does so more reasonably:


IF OBJECT_ID('tempdb..#StatementSubTreeCost') IS NOT NULL
DROP TABLE #StatementSubTreeCost ;

CREATE TABLE #StatementSubTreeCost
(
StatementSubTreeCost FLOAT
,StatementId INT
,UseCounts BIGINT
,plan_handle VARBINARY(64)
) ;

WITH XMLNAMESPACES
(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
INSERT INTO #StatementSubTreeCost (StatementSubTreeCost, StatementId, UseCounts, plan_handle)
SELECT TOP 25
c.value('@StatementSubTreeCost', 'float') AS StatementSubTreeCost
,c.value('@StatementId', 'float') AS StatementId
,cp.UseCounts
,cp.plan_handle
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
CROSS APPLY qp.query_plan.nodes('//StmtSimple') t(c)
WHERE cp.cacheobjtype = 'Compiled Plan'
AND qp.query_plan.exist('//StmtSimple') = 1
AND c.value('@StatementSubTreeCost', 'float') IS NOT NULL
ORDER BY c.value('@StatementSubTreeCost', 'float') DESC;

WITH XMLNAMESPACES
(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
,cQueryStats
AS (
SELECT query_hash
,SUM(total_worker_time / NULLIF(qs.execution_count,0)) AS avg_worker_time
,SUM(total_logical_reads / NULLIF(qs.execution_count,0)) AS avg_logical_reads
,SUM(total_elapsed_time / NULLIF(qs.execution_count,0)) AS avg_elapsed_time
FROM sys.dm_exec_query_stats qs
GROUP BY query_hash
)
SELECT
s.StatementSubTreeCost
, s.usecounts
, CAST(c.value('@StatementEstRows', 'float') AS bigint) AS StatementEstRows
, qs.avg_worker_time
, qs.avg_logical_reads
, qs.avg_elapsed_time
,c.value('@StatementType', 'varchar(255)') AS StatementType
,c.value('@StatementText', 'varchar(max)') AS StatementText
,s.plan_handle
,qp.query_plan
FROM #StatementSubTreeCost s
CROSS APPLY sys.dm_exec_query_plan(s.plan_handle) AS qp
CROSS APPLY qp.query_plan.nodes('//StmtSimple') t(c)
LEFT OUTER JOIN cQueryStats qs ON c.value('xs:hexBinary(substring(@QueryHash,3))','binary(8)') = query_hash
WHERE c.value('@StatementId', 'float') = s.StatementId
ORDER BY c.value('@StatementSubTreeCost', 'float') DESC, s.StatementSubTreeCost DESC

The query returns the following columns:

The Query Results

Similar to the previous post on this element, let’s take a look at the results from a couple SQL Servers.  There are a few things that do pop out to me on these servers that lead me to want to investigate a few things.  It’ll be obvious below, but I’m going to talk about these stats and then the related systems, but I won’t be providing information on those investigations.  I left that out for brevity – but what I provide should provide direction on what you can look for with your own results.

Result Set from Server A

image

There isn’t anything excessively critical in this list.  Though, I wouldn’t be surprised to see a noticeable performance hit on the data returned by the blue item if it ran during peak hours.  But with today being Friday, digging into a couple of these is a good way to wrap up the weekend.

Result Set from Server B

Now lets take a look at another SQL Server to see how the results can differ and possibly provide us with a different path to follow.

image

Based on the items I pointed out, the maintenance processes need to have a little review.  They are heavy handed and likely (actually are) causing side effects that are unneeded.  The upside to fixing these is that the results above will likely shift dramatically based on a little bit of work.

StatementSubTreeCost Wrap-up

Looking into the plan cache for StatementSubTreeCost is a little better than looking for StatementEstRows.  Where the later could be a justifiable row count, a cost that reaches the millions is often one that you can say is – just bad.  When you look at lower costs, you will need to make choices between the cost and frequency of execution to determine if you should investigate.  The key is identify worthwhile places to look to be able to affect improvements in your environments, looking at the plan cache in this manner does just that.

Also, the information included above from sys.dm_exec_query_stats should be considered to be generalized performance information.  This is because the DMV returns information based on individual query statements which may or may not be part of the average execution of a plan or statement.

One final bit of caution, when querying the plan cache be careful not to start your queries and walk off.  The plan cache is pretty active on most environments and writing XQuery against it can sometimes lead to lengthy execution times.  Don’t make the mistake of firing off your query and causing your own side-effect issues.  Not saying I did that once or that doing that caused me to do some re-writing of the query above.

Downloads

Related posts:

  1. Can You Dig It? – Find Estimated RowCounts
  2. Can You Dig It? – StmtSimple Element
  3. Can You Dig It? – The Plan Cache


Copyright © 2002-2014 Simple Talk Publishing. All Rights Reserved. Privacy Policy. Terms of Use. Report Abuse.