June 9, 2015 at 7:08 am
I have a view defined as:
CREATE VIEW [san].[vw_Switch_Host]
AS
SELECT DISTINCT S.NAME AS Frame
,H.NAME AS Host
,SW.NAME AS Switch
,SWP.NAME AS PortName
,SWP.STATUS AS PortStatus
FROM san.storage AS S
LEFT JOIN san.logical AS L ON L.storageId = S.id
LEFT JOIN san.host AS H ON H.id = L.hostid
LEFT JOIN san.host_port AS HP ON HP.hostId = L.hostid
LEFT JOIN san.port_connectivity AS PC ON PC.wwn = HP.wwn
LEFT JOIN san.switch_port AS SWP ON SWP.wwn = PC.connectedWwn
LEFT JOIN san.switch AS SW ON SW.id = SWP.switchId
WHERE (H.NAME IS NOT NULL)
AND (PC.connectedType = 'SWITCH_PORT')
GO
When I run this view in SSMS using:
select * from Source_Staging.san.vw_Switch_Host
The query returns about 18000 rows in 3 seconds. I also have an agent job that runs the same query and uses it to populate a table. The target table is truncated first. When Agent runs the job, it typically runs for 2-3 hours. As of this writing the job, which started just after 6 am, is still running -- 3 hours later. It is also racking up the read counts (currently at 147,589,762).
While this job is still running, I have rerun the query in SSMS multiple times, each running about 3 seconds and doing io proportional to the rowset returned.
I don't get this. I'm looking for clues as to where to look.
How can an Agent job using this view run for hours and use so much I/O when -- at the same time -- I can query the view again and again with very fast resuilts?
Where should I look?
Gerald Britton, Pluralsight courses
June 9, 2015 at 7:28 am
The SQL Agent job is probably running a different execution plan compared to the one you get when you run the query in SSMS.
The reason might be one among:
* stale statistics compared to the time when you ran the query in SSMS
* different SET options
* parameter sniffing (any parameters here? what's the parameterization setting of the database?)
The go-to resource for this kind of issues is Erland Sommarskog's fine article on execution plan mysteries: http://www.sommarskog.se/query-plan-mysteries.html
-- Gianluca Sartori
June 9, 2015 at 7:45 am
Since the other query is loading a table and you're just running a SELECT statement, the problem could be in the table load. Contention on the disk, resource contention somewhere else. I'd look at sys.dm_exec_requests and see what the heck is causing the query to run slow (last wait is listed there) or see if it's blocked by another process (stuff trying to read from the table as it's being created maybe?). Use extended events to capture wait metrics during the next run of the process. It might not be your view.
Do you really need DISTINCT? It's an aggregation command and that will certainly add to the overhead of the query itself.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
June 9, 2015 at 8:19 am
spaghettidba (6/9/2015)
The SQL Agent job is probably running a different execution plan compared to the one you get when you run the query in SSMS.The reason might be one among:
* stale statistics compared to the time when you ran the query in SSMS
* different SET options
* parameter sniffing (any parameters here? what's the parameterization setting of the database?)
The go-to resource for this kind of issues is Erland Sommarskog's fine article on execution plan mysteries: http://www.sommarskog.se/query-plan-mysteries.html
You've hit the nail on the head. There are two execution plans in the cache for the same query. On uses hash, the other (the bad one) nested loops. I can reproduce this at will:
1. Run the query from SSMS (fast). Look in the plan cache, see the "good" plan (hash)
2. Run the Agent Job (Actually, run the SSIS package that the Agent job runs). Identical query text, yet another "bad" plan (loops) is added to the cache.
Fun fact: The bad plan shows
Reason For Early Termination Of Statement Optimization: Time Out
Now, I have to figure out why the first, good plan was not resused and a second, bad plan was compiled instead. Also not clear: why the optimizer timed out on the second plan.
Gerald Britton, Pluralsight courses
June 9, 2015 at 8:23 am
Differences in parameter values or WHERE clause criteria? Use of hard coded values or parameters (either of which is sniffed) vs. local variables (which are not, except during a recompile)? Differences in ANSI connection settings?
Those are the most common issues.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
June 9, 2015 at 8:44 am
Grant Fritchey (6/9/2015)
Differences in parameter values or WHERE clause criteria? Use of hard coded values or parameters (either of which is sniffed) vs. local variables (which are not, except during a recompile)? Differences in ANSI connection settings?Those are the most common issues.
Good guesses but that's not it! The good plan is generated with ARITH_ABORT ON, but ARITH_ABORT OFF is the default for the database.
Difference in options == no plan reuse!
This article explains the problem: SET ARITHABORT (Transact-SQL)
Now I have to figure out a way to force the good plan reuse
Gerald Britton, Pluralsight courses
June 9, 2015 at 8:50 am
g.britton (6/9/2015)
Now, I have to figure out why the first, good plan was not resused and a second, bad plan was compiled instead.
If you correlate the plan from the cache with sys.dm_exec_plan_attributes you should see what the set options cached for that plan look like.
I use this query when I have to investigate this kind of issues:
Fist create this function in tempdb:
USE tempdb;
GO
IF OBJECT_ID('fn_QueryTextFromHandle') IS NOT NULL
DROP FUNCTION fn_QueryTextFromHandle;
GO
CREATE FUNCTION [fn_QueryTextFromHandle] (
@handle VARBINARY(64)
,@statement_start_offset INT
,@statement_end_offset INT
)
RETURNS TABLE
AS
RETURN (
WITH offs AS (
SELECT [start] = ISNULL(@statement_start_offset, 0)
,[end] =
CASE
WHEN @statement_end_offset IS NULL OR @statement_end_offset = 0
THEN - 1
ELSE @statement_end_offset
END
),
sql_info AS (
SELECT [dbid] = t.dbid
,[objectid] = t.objectid
,[encrypted] = t.encrypted
,[batch] = t.text
FROM sys.dm_exec_sql_text(@handle) AS t
),
queries AS (
SELECT
[dbid]
,[objectid]
,[encrypted]
,[batch]
,[query] =
CASE
WHEN encrypted = CAST(1 AS BIT)
THEN N'encrypted text'
ELSE LTRIM(SUBSTRING([batch], [start] / 2 + 1, (
(
CASE
WHEN [end] = - 1
THEN DATALENGTH([batch])
ELSE [end]
END
) - [start]
) / 2))
END
FROM offs
CROSS JOIN sql_info
)
SELECT
[dbid]
,[objectid]
,[encrypted]
,[query] =
CASE
WHEN DATALENGTH([query]) = 0
THEN [batch]
ELSE [query]
END
FROM queries
WHERE @handle IS NOT NULL
)
GO
Then you can use this query to identify the offending plans:
-- date: 2012/10/26
-- author: Gianluca Sartori - @spaghettidba
--
-- Estrae i "TOP expensive statement" dalla plan cache
--
-- Gli statement possono essere ordinati per:
-- avg_elapsed_time
-- avg_worker_time
-- avg_logical_reads
-- avg_logical_writes
-- avg_physical_reads
WITH QUERY_STATS AS (
SELECT execution_count
,total_elapsed_time / execution_count AS avg_elapsed_time
,total_worker_time / execution_count AS avg_worker_time
,total_logical_reads / execution_count AS avg_logical_reads
,total_logical_writes / execution_count AS avg_logical_writes
,total_physical_reads / execution_count AS avg_physical_reads
,stats.statement_start_offset
,stats.statement_end_offset
,sql.query AS stmt
,(
-- Estrae il singolo statement dal testo complessivo del batch
-- e converte lo statement in un XML, per attivare l'hyperlink in SSMS <<
SELECT CHAR(10) + query + CHAR(10) AS [processing-instruction(query)]
FOR XML PATH('')
,TYPE
) AS statement
,DB_NAME(sql.dbid) + '.' + OBJECT_SCHEMA_NAME(sql.objectid, sql.dbid) + '.' + OBJECT_NAME(sql.objectid, sql.dbid) AS object_name
,CAST(q_plan.query_plan AS xml) AS query_plan
,set_options.set_options
,stats.plan_handle
FROM sys.dm_exec_query_stats AS stats
CROSS APPLY fn_QueryTextFromHandle(stats.sql_handle, stats.statement_start_offset, stats.statement_end_offset) AS sql
CROSS APPLY sys.dm_exec_text_query_plan(stats.plan_handle,stats.statement_start_offset,stats.statement_end_offset) AS q_plan
CROSS APPLY sys.dm_exec_plan_attributes(stats.plan_handle) AS pa
CROSS APPLY (
SELECT STUFF((
SELECT *
FROM (
SELECT ',ANSI_PADDING' AS [text()] WHERE CAST(pa.value AS int) & 1 = 1
UNION ALL SELECT ',PARALLEL PLAN' WHERE CAST(pa.value AS int) & 2 = 1
UNION ALL SELECT ',FORCEPLAN' WHERE CAST(pa.value AS int) & 4 = 4
UNION ALL SELECT ',CONCAT_NULL_YIELDS_NULL' WHERE CAST(pa.value AS int) & 8 = 8
UNION ALL SELECT ',ANSI_WARNINGS' WHERE CAST(pa.value AS int) & 16 = 16
UNION ALL SELECT ',ANSI_NULLS' WHERE CAST(pa.value AS int) & 32 = 32
UNION ALL SELECT ',QUOTED_IDENTIFIER' WHERE CAST(pa.value AS int) & 64 = 64
UNION ALL SELECT ',ANSI_NULL_DFLT_ON' WHERE CAST(pa.value AS int) & 128 = 128
UNION ALL SELECT ',ANSI_NULL_DFLT_OFF' WHERE CAST(pa.value AS int) & 256 = 256
UNION ALL SELECT ',NOBROWSETABLE' WHERE CAST(pa.value AS int) & 512 = 512
UNION ALL SELECT ',TRIGGERONEROW' WHERE CAST(pa.value AS int) & 1024 = 1024
UNION ALL SELECT ',RESYNCQUERY' WHERE CAST(pa.value AS int) & 2048 = 2048
UNION ALL SELECT ',ARITHABORT' WHERE CAST(pa.value AS int) & 4096 = 4096
UNION ALL SELECT ',NUMERIC_ROUNDABORT' WHERE CAST(pa.value AS int) & 8192 = 8192
UNION ALL SELECT ',DATEFIRST' WHERE CAST(pa.value AS int) & 16384 = 16384
UNION ALL SELECT ',DATEFORMAT' WHERE CAST(pa.value AS int) & 32768 = 32768
UNION ALL SELECT ',LANGUAGEID' WHERE CAST(pa.value AS int) & 65536 = 65536
UNION ALL SELECT ',UPON' WHERE CAST(pa.value AS int) & 131072 = 131072
) AS src
ORDER BY 1
FOR XML PATH('')
),1,1,SPACE(0))
) AS set_options(set_options)
WHERE 1 = 1
AND pa.attribute = 'set_options'
)
SELECT *
FROM QUERY_STATS
WHERE 1 = 1
AND CAST(statement AS nvarchar(max)) LIKE '%vw_Switch_Host%'
Look at the set_options column: you should see different options if my guess is correct.
-- Gianluca Sartori
June 9, 2015 at 9:05 am
Look at the set_options column: you should see different options if my guess is correct.
Yup, it's all about ARITHABORT. The default for the database is OFF. The default for SSMS is ON. Different options, different plans.
Now my problem is how to fix it. I want the good plan (obviously) but don't want to mess with the database settings (sure to make something break).
Interesting side-note:
I set up a new instance of SS 2012, and I can see that the default setting in Model is ARITHABORT OFF. this, in spite of Microsoft's warnings about it.
Gerald Britton, Pluralsight courses
June 9, 2015 at 9:12 am
ARITHABORT is a client setting that can be changed at any moment.
If you change your SSMS connection setting and clear the cache for that plan ( DBCC FREEPROCCACHE(plan_handle) ) does a good plan appear?
-- Gianluca Sartori
June 9, 2015 at 9:15 am
spaghettidba (6/9/2015)
ARITHABORT is a client setting that can be changed at any moment.If you change your SSMS connection setting and clear the cache for that plan ( DBCC FREEPROCCACHE(plan_handle) ) does a good plan appear?
Actually, it's the opposite. If I change my SSMS setting to ARITHABORT OFF, the optimizer times out building the plan and leaves me with the nested loops plan (bad).
What I'm going to try is force ARITHABORT ON when I run the query from my SSIS package.
Update: I inserted a SET ARITHABORT ON statement before my SELECT in my package and got the good plan!
Gerald Britton, Pluralsight courses
June 9, 2015 at 9:18 am
g.britton (6/9/2015)
Grant Fritchey (6/9/2015)
Differences in parameter values or WHERE clause criteria? Use of hard coded values or parameters (either of which is sniffed) vs. local variables (which are not, except during a recompile)? Differences in ANSI connection settings?Those are the most common issues.
Good guesses but that's not it! The good plan is generated with ARITH_ABORT ON, but ARITH_ABORT OFF is the default for the database.
Difference in options == no plan reuse!
This article explains the problem: SET ARITHABORT (Transact-SQL)
Now I have to figure out a way to force the good plan reuse
I recognize that I'm dense, but how is my suggestion that the ANSI settings are different wrong? ARITH_ABORT is part of the ANSI connection settings. You're saying that's the cause.
And yeah, I know that differences in these settings leads to differences in the execution plans, which is why i suggested it.
As to forcing plan use, the single best bet is to get the connection settings right from your Agent. To do that, believe it or not, it's unlikely to use the defaults for the databse, although you can try. Instead you may just have to use the SET command when you call the code.
Another, less savory, option would be to try plan forcing through plan guides.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
June 9, 2015 at 9:53 am
Grant Fritchey (6/9/2015)
g.britton (6/9/2015)
Grant Fritchey (6/9/2015)
Differences in parameter values or WHERE clause criteria? Use of hard coded values or parameters (either of which is sniffed) vs. local variables (which are not, except during a recompile)? Differences in ANSI connection settings?Those are the most common issues.
Good guesses but that's not it! The good plan is generated with ARITH_ABORT ON, but ARITH_ABORT OFF is the default for the database.
Difference in options == no plan reuse!
This article explains the problem: SET ARITHABORT (Transact-SQL)
Now I have to figure out a way to force the good plan reuse
I recognize that I'm dense, but how is my suggestion that the ANSI settings are different wrong? ARITH_ABORT is part of the ANSI connection settings. You're saying that's the cause.
You're right about that, I'm wrong
And yeah, I know that differences in these settings leads to differences in the execution plans, which is why i suggested it.
As to forcing plan use, the single best bet is to get the connection settings right from your Agent. To do that, believe it or not, it's unlikely to use the defaults for the databse, although you can try. Instead you may just have to use the SET command when you call the code.
Another, less savory, option would be to try plan forcing through plan guides.
And that's how I fixed it more or less (Set command, not plan guide). Just added SET ARITHABORT ON to the code in my SSIS package querying the view.
Gerald Britton, Pluralsight courses
June 9, 2015 at 10:00 am
Glad you sorted it out
-- Gianluca Sartori
June 9, 2015 at 12:56 pm
I agree. It's always good to know what ultimately worked.
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt
Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning
June 10, 2015 at 3:32 pm
Put in a query hint to force a HASH join. I'd rely more on that than on an ARITHABORT setting.
SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".
Viewing 15 posts - 1 through 15 (of 16 total)
You must be logged in to reply to this topic. Login to reply