Looking for clues on a performance problem

Question

Looking for clues on a performance problem

g.britton

SSChampion

Points: 13863
More actions
June 9, 2015 at 7:08 am

#318062

I have a view defined as:
CREATE VIEW [san].[vw_Switch_Host]
AS
SELECT DISTINCT S.NAME AS Frame
,H.NAME AS Host
,SW.NAME AS Switch
,SWP.NAME AS PortName
,SWP.STATUS AS PortStatus
FROM san.storage AS S
LEFT JOIN san.logical AS L ON L.storageId = S.id
LEFT JOIN san.host AS H ON H.id = L.hostid
LEFT JOIN san.host_port AS HP ON HP.hostId = L.hostid
LEFT JOIN san.port_connectivity AS PC ON PC.wwn = HP.wwn
LEFT JOIN san.switch_port AS SWP ON SWP.wwn = PC.connectedWwn
LEFT JOIN san.switch AS SW ON SW.id = SWP.switchId
WHERE (H.NAME IS NOT NULL)
AND (PC.connectedType = 'SWITCH_PORT')
GO
When I run this view in SSMS using:
select * from Source_Staging.san.vw_Switch_Host
The query returns about 18000 rows in 3 seconds. I also have an agent job that runs the same query and uses it to populate a table. The target table is truncated first. When Agent runs the job, it typically runs for 2-3 hours. As of this writing the job, which started just after 6 am, is still running -- 3 hours later. It is also racking up the read counts (currently at 147,589,762).
While this job is still running, I have rerun the query in SSMS multiple times, each running about 3 seconds and doing io proportional to the rowset returned.
I don't get this. I'm looking for clues as to where to look.
How can an Agent job using this view run for hours and use so much I/O when -- at the same time -- I can query the view again and again with very fast resuilts?
Where should I look?
Gerald Britton, Pluralsight courses

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

spaghettidba SSC Guru Points: 105732 More actions · Answer 1

The SQL Agent job is probably running a different execution plan compared to the one you get when you run the query in SSMS.

The reason might be one among:

* stale statistics compared to the time when you ran the query in SSMS

* different SET options

* parameter sniffing (any parameters here? what's the parameterization setting of the database?)

The go-to resource for this kind of issues is Erland Sommarskog's fine article on execution plan mysteries: http://www.sommarskog.se/query-plan-mysteries.html

-- Gianluca Sartori

Grant Fritchey SSC Guru Points: 399330 More actions · Answer 2

Since the other query is loading a table and you're just running a SELECT statement, the problem could be in the table load. Contention on the disk, resource contention somewhere else. I'd look at sys.dm_exec_requests and see what the heck is causing the query to run slow (last wait is listed there) or see if it's blocked by another process (stuff trying to read from the table as it's being created maybe?). Use extended events to capture wait metrics during the next run of the process. It might not be your view.

Do you really need DISTINCT? It's an aggregation command and that will certainly add to the overhead of the query itself.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

g.britton SSChampion Points: 13863 More actions · Answer 3

spaghettidba (6/9/2015)
The SQL Agent job is probably running a different execution plan compared to the one you get when you run the query in SSMS.
The reason might be one among:
* stale statistics compared to the time when you ran the query in SSMS
* different SET options
* parameter sniffing (any parameters here? what's the parameterization setting of the database?)
The go-to resource for this kind of issues is Erland Sommarskog's fine article on execution plan mysteries: http://www.sommarskog.se/query-plan-mysteries.html

You've hit the nail on the head. There are two execution plans in the cache for the same query. On uses hash, the other (the bad one) nested loops. I can reproduce this at will:

1. Run the query from SSMS (fast). Look in the plan cache, see the "good" plan (hash)

2. Run the Agent Job (Actually, run the SSIS package that the Agent job runs). Identical query text, yet another "bad" plan (loops) is added to the cache.

Fun fact: The bad plan shows

Reason For Early Termination Of Statement Optimization: Time Out

Now, I have to figure out why the first, good plan was not resused and a second, bad plan was compiled instead. Also not clear: why the optimizer timed out on the second plan.

Gerald Britton, Pluralsight courses

Grant Fritchey SSC Guru Points: 399330 More actions · Answer 4

Differences in parameter values or WHERE clause criteria? Use of hard coded values or parameters (either of which is sniffed) vs. local variables (which are not, except during a recompile)? Differences in ANSI connection settings?

Those are the most common issues.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

g.britton SSChampion Points: 13863 More actions · Answer 5

Grant Fritchey (6/9/2015)
Differences in parameter values or WHERE clause criteria? Use of hard coded values or parameters (either of which is sniffed) vs. local variables (which are not, except during a recompile)? Differences in ANSI connection settings?
Those are the most common issues.

Good guesses but that's not it! The good plan is generated with ARITH_ABORT ON, but ARITH_ABORT OFF is the default for the database.

Difference in options == no plan reuse!

This article explains the problem: SET ARITHABORT (Transact-SQL)

Now I have to figure out a way to force the good plan reuse

Gerald Britton, Pluralsight courses

spaghettidba SSC Guru Points: 105732 More actions · Answer 6

g.britton (6/9/2015)
Now, I have to figure out why the first, good plan was not resused and a second, bad plan was compiled instead.

If you correlate the plan from the cache with sys.dm_exec_plan_attributes you should see what the set options cached for that plan look like.

I use this query when I have to investigate this kind of issues:

Fist create this function in tempdb:

USE tempdb;

GO

IF OBJECT_ID('fn_QueryTextFromHandle') IS NOT NULL

DROP FUNCTION fn_QueryTextFromHandle;

GO

CREATE FUNCTION [fn_QueryTextFromHandle] (

@handle VARBINARY(64)

,@statement_start_offset INT

,@statement_end_offset INT

)

RETURNS TABLE

AS

RETURN (

WITH offs AS (

SELECT [start] = ISNULL(@statement_start_offset, 0)

,[end] =

CASE

WHEN @statement_end_offset IS NULL OR @statement_end_offset = 0

THEN - 1

ELSE @statement_end_offset

END

),

sql_info AS (

SELECT [dbid] = t.dbid

,[objectid] = t.objectid

,[encrypted] = t.encrypted

,[batch] = t.text

FROM sys.dm_exec_sql_text(@handle) AS t

),

queries AS (

SELECT

[dbid]

,[objectid]

,[encrypted]

,[batch]

,[query] =

CASE

WHEN encrypted = CAST(1 AS BIT)

THEN N'encrypted text'

ELSE LTRIM(SUBSTRING([batch], [start] / 2 + 1, (

(

CASE

WHEN [end] = - 1

THEN DATALENGTH([batch])

ELSE [end]

END

) - [start]

) / 2))

END

FROM offs

CROSS JOIN sql_info

)

SELECT

[dbid]

,[objectid]

,[encrypted]

,[query] =

CASE

WHEN DATALENGTH([query]) = 0

THEN [batch]

ELSE [query]

END

FROM queries

WHERE @handle IS NOT NULL

)

GO

Then you can use this query to identify the offending plans:

-- date: 2012/10/26

-- author: Gianluca Sartori - @spaghettidba

--

-- Estrae i "TOP expensive statement" dalla plan cache

--

-- Gli statement possono essere ordinati per:

-- avg_elapsed_time

-- avg_worker_time

-- avg_logical_reads

-- avg_logical_writes

-- avg_physical_reads

WITH QUERY_STATS AS (

SELECT execution_count

,total_elapsed_time / execution_count AS avg_elapsed_time

,total_worker_time / execution_count AS avg_worker_time

,total_logical_reads / execution_count AS avg_logical_reads

,total_logical_writes / execution_count AS avg_logical_writes

,total_physical_reads / execution_count AS avg_physical_reads

,stats.statement_start_offset

,stats.statement_end_offset

,sql.query AS stmt

,(

-- Estrae il singolo statement dal testo complessivo del batch

-- e converte lo statement in un XML, per attivare l'hyperlink in SSMS <<

SELECT CHAR(10) + query + CHAR(10) AS [processing-instruction(query)]

FOR XML PATH('')

,TYPE

) AS statement

,DB_NAME(sql.dbid) + '.' + OBJECT_SCHEMA_NAME(sql.objectid, sql.dbid) + '.' + OBJECT_NAME(sql.objectid, sql.dbid) AS object_name

,CAST(q_plan.query_plan AS xml) AS query_plan

,set_options.set_options

,stats.plan_handle

FROM sys.dm_exec_query_stats AS stats

CROSS APPLY fn_QueryTextFromHandle(stats.sql_handle, stats.statement_start_offset, stats.statement_end_offset) AS sql

CROSS APPLY sys.dm_exec_text_query_plan(stats.plan_handle,stats.statement_start_offset,stats.statement_end_offset) AS q_plan

CROSS APPLY sys.dm_exec_plan_attributes(stats.plan_handle) AS pa

CROSS APPLY (

SELECT STUFF((

SELECT *

FROM (

SELECT ',ANSI_PADDING' AS [text()] WHERE CAST(pa.value AS int) & 1 = 1

UNION ALL SELECT ',PARALLEL PLAN' WHERE CAST(pa.value AS int) & 2 = 1

UNION ALL SELECT ',FORCEPLAN' WHERE CAST(pa.value AS int) & 4 = 4

UNION ALL SELECT ',CONCAT_NULL_YIELDS_NULL' WHERE CAST(pa.value AS int) & 8 = 8

UNION ALL SELECT ',ANSI_WARNINGS' WHERE CAST(pa.value AS int) & 16 = 16

UNION ALL SELECT ',ANSI_NULLS' WHERE CAST(pa.value AS int) & 32 = 32

UNION ALL SELECT ',QUOTED_IDENTIFIER' WHERE CAST(pa.value AS int) & 64 = 64

UNION ALL SELECT ',ANSI_NULL_DFLT_ON' WHERE CAST(pa.value AS int) & 128 = 128

UNION ALL SELECT ',ANSI_NULL_DFLT_OFF' WHERE CAST(pa.value AS int) & 256 = 256

UNION ALL SELECT ',NOBROWSETABLE' WHERE CAST(pa.value AS int) & 512 = 512

UNION ALL SELECT ',TRIGGERONEROW' WHERE CAST(pa.value AS int) & 1024 = 1024

UNION ALL SELECT ',RESYNCQUERY' WHERE CAST(pa.value AS int) & 2048 = 2048

UNION ALL SELECT ',ARITHABORT' WHERE CAST(pa.value AS int) & 4096 = 4096

UNION ALL SELECT ',NUMERIC_ROUNDABORT' WHERE CAST(pa.value AS int) & 8192 = 8192

UNION ALL SELECT ',DATEFIRST' WHERE CAST(pa.value AS int) & 16384 = 16384

UNION ALL SELECT ',DATEFORMAT' WHERE CAST(pa.value AS int) & 32768 = 32768

UNION ALL SELECT ',LANGUAGEID' WHERE CAST(pa.value AS int) & 65536 = 65536

UNION ALL SELECT ',UPON' WHERE CAST(pa.value AS int) & 131072 = 131072

) AS src

ORDER BY 1

FOR XML PATH('')

),1,1,SPACE(0))

) AS set_options(set_options)

WHERE 1 = 1

AND pa.attribute = 'set_options'

)

SELECT *

FROM QUERY_STATS

WHERE 1 = 1

AND CAST(statement AS nvarchar(max)) LIKE '%vw_Switch_Host%'

Look at the set_options column: you should see different options if my guess is correct.

-- Gianluca Sartori

g.britton SSChampion Points: 13863 More actions · Answer 7

Look at the set_options column: you should see different options if my guess is correct.

Yup, it's all about ARITHABORT. The default for the database is OFF. The default for SSMS is ON. Different options, different plans.

Now my problem is how to fix it. I want the good plan (obviously) but don't want to mess with the database settings (sure to make something break).

Interesting side-note:

I set up a new instance of SS 2012, and I can see that the default setting in Model is ARITHABORT OFF. this, in spite of Microsoft's warnings about it.

Gerald Britton, Pluralsight courses

spaghettidba SSC Guru Points: 105732 More actions · Answer 8

ARITHABORT is a client setting that can be changed at any moment.

If you change your SSMS connection setting and clear the cache for that plan ( DBCC FREEPROCCACHE(plan_handle) ) does a good plan appear?

-- Gianluca Sartori

g.britton SSChampion Points: 13863 More actions · Answer 9

spaghettidba (6/9/2015)
ARITHABORT is a client setting that can be changed at any moment.
If you change your SSMS connection setting and clear the cache for that plan ( DBCC FREEPROCCACHE(plan_handle) ) does a good plan appear?

Actually, it's the opposite. If I change my SSMS setting to ARITHABORT OFF, the optimizer times out building the plan and leaves me with the nested loops plan (bad).

What I'm going to try is force ARITHABORT ON when I run the query from my SSIS package.

Update: I inserted a SET ARITHABORT ON statement before my SELECT in my package and got the good plan!

Gerald Britton, Pluralsight courses

Grant Fritchey SSC Guru Points: 399330 More actions · Answer 10

g.britton (6/9/2015)
Grant Fritchey (6/9/2015)
Differences in parameter values or WHERE clause criteria? Use of hard coded values or parameters (either of which is sniffed) vs. local variables (which are not, except during a recompile)? Differences in ANSI connection settings?
Those are the most common issues.
Good guesses but that's not it! The good plan is generated with ARITH_ABORT ON, but ARITH_ABORT OFF is the default for the database.
Difference in options == no plan reuse!
This article explains the problem: SET ARITHABORT (Transact-SQL)
Now I have to figure out a way to force the good plan reuse

I recognize that I'm dense, but how is my suggestion that the ANSI settings are different wrong? ARITH_ABORT is part of the ANSI connection settings. You're saying that's the cause.

And yeah, I know that differences in these settings leads to differences in the execution plans, which is why i suggested it.

As to forcing plan use, the single best bet is to get the connection settings right from your Agent. To do that, believe it or not, it's unlikely to use the defaults for the databse, although you can try. Instead you may just have to use the SET command when you call the code.

Another, less savory, option would be to try plan forcing through plan guides.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

g.britton SSChampion Points: 13863 More actions · Answer 11

Grant Fritchey (6/9/2015)
g.britton (6/9/2015)
Grant Fritchey (6/9/2015)
Differences in parameter values or WHERE clause criteria? Use of hard coded values or parameters (either of which is sniffed) vs. local variables (which are not, except during a recompile)? Differences in ANSI connection settings?
Those are the most common issues.
Good guesses but that's not it! The good plan is generated with ARITH_ABORT ON, but ARITH_ABORT OFF is the default for the database.
Difference in options == no plan reuse!
This article explains the problem: SET ARITHABORT (Transact-SQL)
Now I have to figure out a way to force the good plan reuse
I recognize that I'm dense, but how is my suggestion that the ANSI settings are different wrong? ARITH_ABORT is part of the ANSI connection settings. You're saying that's the cause.

You're right about that, I'm wrong

And yeah, I know that differences in these settings leads to differences in the execution plans, which is why i suggested it.
As to forcing plan use, the single best bet is to get the connection settings right from your Agent. To do that, believe it or not, it's unlikely to use the defaults for the databse, although you can try. Instead you may just have to use the SET command when you call the code.
Another, less savory, option would be to try plan forcing through plan guides.

And that's how I fixed it more or less (Set command, not plan guide). Just added SET ARITHABORT ON to the code in my SSIS package querying the view.

Gerald Britton, Pluralsight courses

spaghettidba SSC Guru Points: 105732 More actions · Answer 12

spaghettidba

SSC Guru

Points: 105732

June 9, 2015 at 10:00 am

#1804018

Glad you sorted it out

-- Gianluca Sartori

Grant Fritchey SSC Guru Points: 399330 More actions · Answer 13

I agree. It's always good to know what ultimately worked.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

ScottPletcher SSC Guru Points: 101205 More actions · Answer 14

Put in a query hint to force a HASH join. I'd rely more on that than on an ARITHABORT setting.

SQL DBA,SQL Server MVP(07, 08, 09) "It's a dog-eat-dog world, and I'm wearing Milk-Bone underwear." "Norm", on "Cheers". Also from "Cheers", from "Carla": "You need to know 3 things about Tortelli men: Tortelli men draw women like flies; Tortelli men treat women like flies; Tortelli men's brains are in their flies".