Per-Call Performance Monitoring

  • I work on a product that uses a large number of databases to store user data. The data is partitioned across the databases at the user level, so we guarantee that everything related to a specific user resides in a single database. On top of that, the data stored per user can vary, resulting in different performance of our stored procedures between databases (statistics skew) and between users (data set sizes). This difference can sometimes be extreme, so I want to come up with a mechanism to collect the following data per sproc call so that we can detect issues as early as possible:

    1. Execution time

    2. CPU time

    3. IO

    I can get the execution time from the machine calling the database, but so far I haven't found an inexpensive mechanism to collect the CPU and IO information. Thoughts I've had:

    1. Using "set statistics io on" and "set statistics time on", parsing the message output, and generating totals for the values I care about

    - Doable but awkward

    2. Using sys.dm_exec_sessions to harvest before and after values and return a new result set with the performance stats (see the sketch after this list)

    - It doesn't appear as though the DMV updates fast enough to get meaningful data (I'm seeing 0s come back)

    3. Hooking into the profiling system to try to match "RPC:Completed" events to their calls

    - Unrealistic... I can't fire up a profiler session and tear it down for every sproc call I make
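
    A minimal sketch of option 2, for reference: snapshot the calling session's cumulative counters before and after the call and report the deltas (the procedure name and parameter are hypothetical placeholders):

    -- Snapshot the session's cumulative counters before the call
    DECLARE @cpu_before int, @reads_before bigint, @writes_before bigint, @lreads_before bigint;

    SELECT @cpu_before    = cpu_time,
           @reads_before  = reads,
           @writes_before = writes,
           @lreads_before = logical_reads
    FROM sys.dm_exec_sessions
    WHERE session_id = @@SPID;

    EXEC dbo.SomeUserProc @UserId = 42;   -- hypothetical sproc call

    -- Report the deltas for this call
    SELECT cpu_time      - @cpu_before    AS cpu_ms,
           reads         - @reads_before  AS physical_reads,
           writes        - @writes_before AS writes,
           logical_reads - @lreads_before AS logical_reads
    FROM sys.dm_exec_sessions
    WHERE session_id = @@SPID;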

    Has anyone else had to do something similar to this before? I'd appreciate any ideas on how I could collect the data I'm after.

  • Extended events?

    But, similar to Profiler, it's more something you'd have running for a period and then analyse as a whole, rather than something you'd read back online per call.
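
    For illustration, a minimal sketch of such a session, assuming SQL Server 2012+ syntax (the session name and file path are placeholders; on 2008/2008 R2 the file target has a different name):

    -- An always-on session capturing duration, CPU and IO per RPC call to a file target
    CREATE EVENT SESSION PerCallStats ON SERVER
    ADD EVENT sqlserver.rpc_completed
    (
        ACTION (sqlserver.database_id, sqlserver.session_id)
        WHERE sqlserver.database_id > 4    -- skip system databases
    )
    ADD TARGET package0.event_file (SET filename = N'C:\XE\PerCallStats.xel')
    WITH (EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS);

    ALTER EVENT SESSION PerCallStats ON SERVER STATE = START;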

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the suggestion, I wasn't aware of Extended Events and it's definitely something that will be useful for me at some point. However, given that I want to be collecting the performance information all the time, it does feel a little heavyweight. So far it seems like parsing the message output is my best approach, as cumbersome as it is...

  • Ificator (10/30/2011)


    I work on a product that uses a large number of databases to store user data. The data is partitioned across the databases at the user level, so we guarantee that everything related to a specific user resides in a single database. On top of that, the data stored per user can vary, resulting in different performance of our stored procedures between databases (statistics skew) and between users (data set sizes). This difference can sometimes be extreme, so I want to come up with a mechanism to collect the following data per sproc call so that we can detect issues as early as possible:

    1. Execution time

    2. CPU time

    3. IO

    Is this what you are looking for?

    SELECT
        DB_NAME(qt.[dbid]) AS [DB Name],
        OBJECT_SCHEMA_NAME(qt.objectid, qt.[dbid]) + '.' + OBJECT_NAME(qt.objectid, qt.[dbid]) AS [SP Name],
        SUBSTRING(qt.[text], qs.statement_start_offset/2 + 1,
            (CASE
                 WHEN qs.statement_end_offset = -1
                     THEN LEN(CONVERT(nvarchar(max), qt.[text])) * 2
                 ELSE qs.statement_end_offset
             END - qs.statement_start_offset)/2 + 1) AS [SQL Statement],
        qs.execution_count,
        ISNULL(1.0 * qs.execution_count / DATEDIFF(second, qs.creation_time, GETDATE()), 0) AS [Calls/Second],
        -- 1. Execution time
        qs.total_elapsed_time,
        qs.last_elapsed_time,
        qs.min_elapsed_time,
        qs.max_elapsed_time,
        qs.total_elapsed_time / NULLIF(qs.execution_count, 0) AS avg_elapsed_time,
        -- 2. CPU time
        qs.total_worker_time,
        qs.last_worker_time,
        qs.min_worker_time,
        qs.max_worker_time,
        qs.total_worker_time / NULLIF(qs.execution_count, 0) AS avg_worker_time,
        -- 3. IO
        qs.total_physical_reads,
        qs.last_physical_reads,
        qs.min_physical_reads,
        qs.max_physical_reads,
        qs.total_physical_reads / NULLIF(qs.execution_count, 0) AS avg_physical_reads,
        qs.total_logical_writes,
        qs.last_logical_writes,
        qs.min_logical_writes,
        qs.max_logical_writes,
        qs.total_logical_writes / NULLIF(qs.execution_count, 0) AS avg_logical_writes,
        qs.total_logical_reads,
        qs.last_logical_reads,
        qs.min_logical_reads,
        qs.max_logical_reads,
        qs.total_logical_reads / NULLIF(qs.execution_count, 0) AS avg_logical_reads
    FROM sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS qt

    You can do a group by [SP Name] if you don't need the statement level details.
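
    A minimal sketch of that roll-up, aggregating the statement-level rows to one row per procedure:

    SELECT
        DB_NAME(qt.[dbid]) AS [DB Name],
        OBJECT_SCHEMA_NAME(qt.objectid, qt.[dbid]) + '.' + OBJECT_NAME(qt.objectid, qt.[dbid]) AS [SP Name],
        SUM(qs.execution_count) AS execution_count,
        SUM(qs.total_elapsed_time) AS total_elapsed_time,
        SUM(qs.total_worker_time) AS total_worker_time,
        SUM(qs.total_logical_reads) AS total_logical_reads,
        SUM(qs.total_physical_reads) AS total_physical_reads,
        SUM(qs.total_logical_writes) AS total_logical_writes
    FROM sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS qt
    WHERE qt.objectid IS NOT NULL   -- keep only statements that belong to a module
    GROUP BY DB_NAME(qt.[dbid]),
             OBJECT_SCHEMA_NAME(qt.objectid, qt.[dbid]) + '.' + OBJECT_NAME(qt.objectid, qt.[dbid]);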


    Alex Suprun

  • Profiler to local disk files, then load into a database for analysis. I do it all the time. Very lightweight going to disk. All you need is the SQL:BatchCompleted and RPC:Completed events and a few ancillary columns such as DatabaseID and StartTime. Once the data is in a database table you can also do aggregate analysis, outlier analysis, etc.
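
    A minimal sketch of the load-and-aggregate step (the file path and table name are hypothetical placeholders):

    -- Load the trace file into a table for analysis
    SELECT *
    INTO dbo.TraceAnalysis
    FROM sys.fn_trace_gettable(N'C:\Traces\PerCallTrace.trc', DEFAULT);

    -- Then aggregate, e.g. per-database averages for the completed-call events
    SELECT DatabaseID,
           COUNT(*)      AS Calls,
           AVG(CPU)      AS AvgCPU,
           AVG(Reads)    AS AvgReads,
           AVG(Duration) AS AvgDuration
    FROM dbo.TraceAnalysis
    WHERE EventClass IN (10, 12)   -- RPC:Completed, SQL:BatchCompleted
    GROUP BY DatabaseID;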

    You can also pony up for one of numerous third party tools to do this for you.

    BTW, Profiler does provide you with parameter values for the stored procedure calls, so you can know which users, etc. are having issues.

    Oh, and I will add that either dynamic SQL (be sure to guard against SQL injection) or OPTION (RECOMPILE) will likely be your friend when dealing with data skew. I do that all the time for clients too. 😎 It is almost always worth giving up some CPU ticks and plan cache (which can be regularly flushed, by the way) to avoid the disastrously bad plans you can get hammered with when you have significant data skew.
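
    For example, a minimal sketch of the OPTION (RECOMPILE) approach inside a procedure (the procedure, table, and parameter names are hypothetical):

    CREATE PROCEDURE dbo.GetUserOrders
        @UserId int
    AS
    BEGIN
        SELECT o.OrderId, o.OrderDate, o.Total
        FROM dbo.Orders AS o
        WHERE o.UserId = @UserId
        OPTION (RECOMPILE);   -- plan is compiled for this @UserId's actual row counts
    END;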

    Oh 2: you aren't using table variables either, right??

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
