SQLServerCentral Article

Monitoring Performance

,

DBAs are in charge of performance on production SQL servers and, sure, they hate to hear complains from end-users that the system is slow. But the truth is that often the complains are not groundless. As long as for developers performance is a last thing to care about after delivering a required functionality, providing wonderful interface, debugging and loud celebrating a new release shipped to production, then we have what we have. As a result, managers use to ask “What’s going on on the server?” and DBAs really need to have a clear and accurate answer.

Getting active processes

For myself I created a stored procedure sp_ActiveProcesses (Listing 1) showing what's running on the SQL Server. The sp has one parameter - time interval - to retrieve snapshot of running processes. Default is 5 sec. It can be decreased to make the snapshot more instantaneous or increased to catch consumers for a longer term. A process to hit into the snapshot must to be running at the beginning and at the end of the period. It's why there is no sense to make the interval too long (though maximum allowed is 59 sec).

I run the sp so often that I had to create a shortcut in my Query Analyzer (Tools/Customize/Custom ...). Install the sp, set the shortcut assigning the sp, say, to Ctrl-4, press these two buttons and you've got the picture (the list of processes wrapped to fit on the page):

ProcessIdTotalCPUCPU_ConsumedInTheTimeFragmentTotalPhysical_IOIO_InTheTimeFragment...
---------------------------------------------------------------------------------------...
552391092110...
8531328317521536...
8856781001795164...

...HostnameApplicationNameNT_LoginNameDatabaseNameSPIDBuffer
--------------------------------------------------------------------------
BillSMSPCompany\BillMSPGetContacts
KirkAMSPCompany\KirkMSPReassignTimeApprover
KimNMSPCompany\KimMSPInsertExpense

TheFragmentDurationNumberOfCPUsSUM_CPU_ConsumedSUM_Physical_IO_Committed
----------------------------------------------------------------------------------------
512321141710

Maximum SUM_CPU_Consumed can be TheFragmentDuration times NumberOfCPUs. In the example above it is 10246 milliseconds.

Just one note about accuracy of the numbers being showed by the sp. Microsoft Knowledge Base Article - 309377 says: In Microsoft SQL Server 2000 (all editions) CPU time for a particular server process ID (SPID) may not accumulate correctly.

It is explained as: SQL Server maintains a pool of worker threads that execute the tasks given to it. SQL Server may assign a different thread from this pool to a given SPID to execute a new query batch. When SQL Server assigns a different thread to a SPID, SQL Server does not properly calculate the CPU time to indicate the accumulated CPU time up to that point for the SPID.

Microsoft has confirmed this to be a problem in SQL Server 2000. This problem was first corrected in Microsoft SQL Server 2000 Service Pack 2.

The wrong calculation almost never happens to user processes, which are most important for us, even there is no SP2 for SQL Server 2000 installed. In contrary, I saw lots of such cases with system processes, for instance, replication agents.

Getting blocked processes

There are two kinds of SQL Server processes in the scope of DBA's concern:

1. Consumers

2. Long-runners

Consumers eat up server resources. By identifying and optimizing the consuming codes you can free some room for other processes. Long-runners are codes having outstanding duration. Exactly they make end-users uncomfortable. However for developers to provide optimization it is more important "what consumes more" than "what works longer". Because sometimes innocent processes take minutes being blocked by other processes (or maybe not blocked, but just waiting for resources - CPU, I/O to be released). And developers can do nothing about the cases. Not these codes but rather those blocking/consuming codes must be optimized. In other words: “consumption” indicates how healthy is the code, “duration” indicates how healthy is the system (hardware, surrounding processes etc).

To identify processes blocked by others I created sp_BlockedProcesses (Listing 2). Assign it to the shortcut, say, Ctrl-5, press the buttons and here we go:

BlockedSPIDBlockedBufferBlockingSPIDBlockingBufferwaitresourceBlockedHostname...

-----------

-------------

------------

--------------

------------

---------------

...
21

GetLateTasks

65

GetImage

21

KimN

...
5

SetStatus

65

GetImage

21

JasonC

...

I bet you can recall cases when some simple code was started and a quick reply was expected, but it seemed was hanging. It is quite probable that by pressing Ctrl-5 you'd see what is the matter. 

Sending emails with top consumers

Both sp_ActiveProcesses and sp_BlockedProcesses are instantaneous. Of course a DBA needs overall reports showing top consumers and long-runners. I can share how I organized it in the company I work for.

1. A job was scheduled to run every morning to start a trace on production servers. One of the trace parameters specifies time to stop the trace. The job runs at 8:30 AM and the trace stops itself at 5:30 PM, when the peak of user activity on production servers is over. The name of the executed stored procedure: spTraceBuild (Listing 3).

spBuildTrace is based on a very useful stored procedure build_trace from the Microsoft Knowledge Base Article Q283790. I did the following minor modification:

a. Added server name and current date to the trace file name

b. Added error handling. If a mistake was done (for example, an option value is incorrect or a date to stop the trace has been already expired) and the trace wasn't created, it's nice to get a message.

c. Changed the code to expect only time to stop the trace (instead of date/time) - 5:30 PM in my case. I.e. you don't need ever to modify ActivityTrace.ini. The trace will be always stopped at the specified time on the same day it was started.

spTraceBuild gets configuration data from a text file named ActivityTrace.ini. It contents could be like:

@tracefile=\\Process02\D$\Program Files\Microsoft SQL Server\MSSQL\LOG\Trace
@maxfilesize=15000
@stoptime=5:30
@options=2
@events=10,12
@columns=1,3,12,13,14,16,17,18
@filter1=10, 0, 7, N'SQL Profiler'

Apparently you need to modify at least the first parameter, @tracefile, to make it appropriate.

Two types of events give us consumption numbers:

10 - RPC Completed

12 - Batch Completed

2. Another job was scheduled to run every night to absorb the trace file and process it, i.e. to insert the trace data into a SQL Server table and aggregate the information. Why to collect the data into a file to bulkcopy them afterward into a table instead of collecting them directly into the table? Firstly, collecting trace data into a file works faster, secondly, you cannot run a trace programmatically into a table as you do when starting a trace from Profiler. I created the following aggregations:

- top CPU consumers

- top long-runners

3. The processing job sends an e-mail containing the reports to managers of development. Every morning development managers can find the "Top consumers" report in their Inbox. That is important as long as performance is a serious issue in my company. You can schedule the trace and processing/reporting jobs to run once a week, for example, on Tuesday, if user activity and workload do not differ from day to day.

The name of the processing/reporting stored procedure is spProcessTrace (Listing 4). An important part of the code is a UDF fnExtractSPNameFromTextData (Listing 5). Definitely, you can aggregate data from a trace only if codes running against the server are named codes as stored procedures are. Ad hoc queries will be out of the scope. However, I do not think any interactive, frequently executed codes can be running as ad hoc queries, which would need compilation on the fly. Therefore, all real top consumers should be represented in the report.

Run the sp from a scheduled job as:

EXEC spProcessTraceFile

@ServerName = 'MyProductionServerName',

@ReportDate = null, -- date the trace file was created on (default - current)

@TraceFilePath = '\\MyProcessingServerName\C$\Program Files\Microsoft SQL Server\MSSQL\LOG\',

@recipients = 'Manager1@MyCompany.com;Manager2@MyCompany.com'

The value specified in @TraceFilePath should match to @tracefile in ActivityTrace.ini.

Resulting report for top CPU consumers looks like:

Top Process02 CPU consumers for Feb 22, 2003:

SPTimesExecutedTotalCPUAverageExecTimeMinCPUMaxCPU
------------------------------------------------------------------------------------------
RemoveCoManager7615531879330595062
UpdateProspect1104745174313232829062
AddStandardTaskTime673457651680594829
TaskbyResource24801306845206656
GetAssetTypes53188872016078
SubmitExpenseById158363696400719
BillingRatesByBillingOffices11063164574321312
SessionCleanUp12315609945019406
CheckSummaryTask230164437146110
RollupSummaryTask20715844760281
CreateBatchNumber2720141465032
RejectTime1345133969079
DeleteBillingRole121210810095781390
ProjectSummary143100036915172
GetApprovedInvoices1297678137181032
ProgressProject228832236094
AddSubProject2807875280265
InsertExpense77422106005906
LoadCustomer166953434312688
PercentOfContractIncurred1645790351547
GetTaxes85469683640828
RolesFeatures653308887501016
GetWorkflowTypes246451918078
GetDraftInvoices250443917063

 Consider locations of the codes and files as the following:

spBuildTrace

msdb on a production server

spProcessTrace

DB Traces on a processing server

fnExtractSPNameFromTextData

DB Traces on a processing server

ActivityTrace.ini

C:\Program Files\Microsoft SQL Server\MSSQL\Binn\

on the production server

Activity Graph

The Results Pane of Query Analyzer does not allow us to represent results graphically, but spActivityGraph (Listing 6) challenges this limitation. The stored procedure uses the trace table created by spProcessTrace. spActivityGraph shows how processes running on the SQL Server interfere with each other. Looking at the graph you can see peak-times, concurrency and how this or that process is taking longer than usually being surrounded by a tough company of other processes:

StartTimeDurationText11:3011:3911:4911:59
11:38:4212033InsertExpense

--

11:39:106220 GetAssetTypes

--

11:40:00122810GetRequestsToApprove

--------------

11:40:0652826GetMonthlyRevenue

------

11:40:1130516GetApproverTimes

----

11:40:1630527GetApproverTimes

----

11:40:1730533GetApproverTimes

----

11:40:2530530PopulatePlanFact

----

11:40:2830543ProjectSummary

----

11:40:2830516LoadLeadByResource

----

11:40:3030513ProjectSummary

----

11:40:3611736SetLockAllTask

--

11:40:3821623InvoiceByClient

--

11:40:42103116PopulatePlanFact

------------

11:40:4415780GetDraftInvoices

--

11:40:4910310InsertAd

--

11:40:509513ModifyCodeUpdatedExpense

--

11:40:518280DeleteBillingRole

--

11:40:5960966ProjectSummary

--------

11:41:0430516AutoEscalate

----

11:41:0730446GetLicenceUpdate

----

11:41:215046GetImageBatch

-

 spActivityGraph has 6 input parameters:

@TraceTable-name of the trace table created in spProcessTrace
@TraceStart-start time of the period
@TraceEnd-end time of the period
@SPNameLayoutLen-width of the column with SP names in the report. Default - 20.
@DurationLayoutLen-width of the column with duration values in the report. Default - 6
@LayoutWidth-width of the report. Default - 115 symbols (wrapping would make the picture totally messed up).

I would suggest to build graphs for 30 minutes intervals. If you increase the interval trying to cover the entire day or at least a half of the day in one graph, duration of one unit (shown by dash '-') will be also increased and even processes with duration more than 10 sec will be missed in the graph. What if nonetheless you would like to see the activity for the entire day? No problem: the stored procedure spShowActivityGraphByChunks (Listing 7) will give you the full day picture divided into 0.5-hour pieces. The only 2 mandatory input parameters for the stored procedure (@ServerName, @ReportDate) serve to identify a trace table to work with.

Conclusion

Scripts referenced in the article are available here.

Stored procedures showing instantaneous and overall performance reports give us a clear picture of user activity on production SQL Servers and help us to find ways to make the performance better.

Rate

5 (1)

Share

Share

Rate

5 (1)