Recently, I was asked to do an extensive sql server performance audit and review. I will be sharing some of the information, scripts and documents that I referenced when I completed this thorough analysis. One of the most important things in performance tuning, is the knowledge that goes along with best practices on SQL Server Waits and Queues. The article is quite in-depth, but contains a treasure trove of valuable performance tuning information, especially with respect to waits and queues.
In this whitepaper, you will find examples and explanations of every kind of Wait Type, and how to correlate it to a particular performance resource issue or bottleneck. You will readily be able to identify the cause of these waits, and learn how to optimize the performance of your sql server. A user typically experiences poor performance as a series of waits. Whenever a SQL query is able to run but is waiting on another resource, it records details about the cause of the wait. These details can be accessed using the sys.dm_os_wait_stats DMV. One can examine the accumulated cause of all the waits, based on the output of the DMV, as displayed below:
So, just to take the first wait type that shows up as the highest % of waiting, on the server, CXPACKET.
This wait occurs when trying to synchronize the query processor exchange iterator. Consider lowering the degree of parallelism if contention on this wait type becomes a problem. Parallel process waits can sometimes occur when data is skewed. In such cases, one parallel thread may process a larger number of rows while another may process a smaller number of rows and so on.
In an OLTP environment, excessive CXPACKET waits can affect the throughput of other OLTP traffic. In a data warehouse environment, CXPACKET waits are expected for multiple proc environments.
Therefore, for OLTP workloads, you may want to consider limiting parallelism by setting max degree of parallelism to some number less than the total number of CPUs, and other than the default '0', meaning ALL processors. Please see the document for definitions on other wait types.
In another example, I already authored some articles on IO usage, statistics, and scripts. High I/O usage alone is not necessarily an indication of a problem, or I/O pressure. In fact, IO_stalls, is the total cumulative time, in milliseconds, that users waited for I/O to be completed on the file since the last restart of SQL Server. One DMV, sys.dm_io_virtual_file_stats provides a breakdown of SQL Server reads, writes, and io_stalls for a particular database or transaction log file.
There is a complete section dedicated to IO_Stalls, which is one type of wait that can help identify performance bottlenecks. However, this is just a tiny bit in a large pool of useful information contained in the doc.
Here's a summary of this document, which can be downloaded directly from Microsoft, by clicking on the link below.
SQL Server 2005 Performance Tuning using the Waits and Queues is a proven methodology that allows one to identify the best opportunities to improve performance, the so called “biggest bang for the buck”. These performance improvements are likely to have a significant return on the performance tuning time investment. The methodology helps identify the areas of slow performance by looking at the problem from two directions called Waits and Queues. An analysis of Waits indicates where SQL Server is spending lots of time waiting. In addition, the biggest waits point out the most important or relevant Queues (that is, Performance Monitor counters and other data) for this workload. The cross validation of the waits analysis enables us to eliminate all except the most significant performance counters, and provides a strong indication of pressure on specific resources. Application performance can be easily explained by looking at SQL Server waits and System or Resource queues. In SQL Server 2005, the dynamic management view (DMV) sys.dm_os_wait_stats provides a valuable source of wait information from an application perspective. The system or resource perspective uses Performance Monitor counters and other data sources to provide a breakdown of system resource usage according to resource queues. Taken together, the value of the application and resource perspectives used by the waits and queues methodology enables us to narrow in on the bottleneck and eliminate the irrelevant performance data. In sum, Performance Tuning using the Waits and Queues methodology is an effective way to quickly identify and resolve application performance problems because it lets the user discover new and potentially unexpected problem areas, within applications or solutions without the typical guesswork that can accompany such work. Performance tuning of applications and solutions has been around for many years. The performance of SQL Server 2005 database applications should be evaluated from several different perspectives. Each perspective tells a different section of the complete performance story. Together they paint a detailed performance picture of the whole and also cross validate observations in each specific perspective. Although, the document was drafted for SQL Server 2005, it is certainly applicable to SQL Server 2008, (some new DMV's may have been introduced). Without further adieu, to download this whitepaper entitled, SQL Server 2005 Waits and Queues - SQL Server Best Practice Article, authored by Tom Davidson, updated by Danny Tambs, and technically reviewed by Sanjay Mishra (full accreditation goes to these gentlemen), goto: http://msdn.microsoft.com/en-us/library/cc966413.aspx --> Scroll down to the bottom to get the doc. I hope your internet connection is good and doesn't keep you WAITing ;-)
SQL Server 2005 Performance Tuning using the Waits and Queues is a proven methodology that allows one to identify the best opportunities to improve performance, the so called “biggest bang for the buck”. These performance improvements are likely to have a significant return on the performance tuning time investment. The methodology helps identify the areas of slow performance by looking at the problem from two directions called Waits and Queues. An analysis of Waits indicates where SQL Server is spending lots of time waiting. In addition, the biggest waits point out the most important or relevant Queues (that is, Performance Monitor counters and other data) for this workload. The cross validation of the waits analysis enables us to eliminate all except the most significant performance counters, and provides a strong indication of pressure on specific resources.
Application performance can be easily explained by looking at SQL Server waits and System or Resource queues. In SQL Server 2005, the dynamic management view (DMV) sys.dm_os_wait_stats provides a valuable source of wait information from an application perspective. The system or resource perspective uses Performance Monitor counters and other data sources to provide a breakdown of system resource usage according to resource queues. Taken together, the value of the application and resource perspectives used by the waits and queues methodology enables us to narrow in on the bottleneck and eliminate the irrelevant performance data.
In sum, Performance Tuning using the Waits and Queues methodology is an effective way to quickly identify and resolve application performance problems because it lets the user discover new and potentially unexpected problem areas, within applications or solutions without the typical guesswork that can accompany such work.
Performance tuning of applications and solutions has been around for many years. The performance of SQL Server 2005 database applications should be evaluated from several different perspectives. Each perspective tells a different section of the complete performance story. Together they paint a detailed performance picture of the whole and also cross validate observations in each specific perspective.
Although, the document was drafted for SQL Server 2005, it is certainly applicable to SQL Server 2008, (some new DMV's may have been introduced).
Without further adieu, to download this whitepaper entitled, SQL Server 2005 Waits and Queues - SQL Server Best Practice Article, authored by Tom Davidson, updated by Danny Tambs, and technically reviewed by Sanjay Mishra (full accreditation goes to these gentlemen), goto:
http://msdn.microsoft.com/en-us/library/cc966413.aspx --> Scroll down to the bottom to get the doc. I hope your internet connection is good and doesn't keep you WAITing ;-)
OK, there was something that Microsoft did to tick me off. SQL Server 2005 introduced us to so many exciting and useful DMVs, such as sys.dm_os_sys_info. This DMV, which would give us a set of useful information about the computer, and about the resources available to and consumed by SQL Server, SQL 2008 came about, and already there was a change here.
One of the very useful things that we were able to derive is CPU usage. This was calculated based on the cpu_ticks and cpu_ticks_in_ms columns. The T-SQL statement that gave us the base value to successfully determine SQLProcessUtilization, OtherProcessUtilization and SystemIdle originally looked like this:
declare @ts_now bigint
select @ts_now = cpu_ticks / convert(float, cpu_ticks_in_ms) from
sys.dm_os_sys_info
However, a funny thing happened when applying the same t-sql logic in SQL Server 2008. The cpu_ticks_in_ms column is no longer present in the sys.dm_os_sys_info DMV in SQL Server 2008. Microsoft determined that this column is no longer accurate, and so, they removed it. This of course caused great consternation among the SQL Server DBA Community. Cries across the internet and sql forums everywhere, folks asked MS to please put it back! The cpu_tickes_in_ms column originally returned the number of CPU ticks in milliseconds. Due to an imprecision in this value, they could not guarantee the accurate measure. Microsoft posted the following explanation:
‘SQL 2005 used the value presented in cpu_ticks_in_ms as a conversion factor for high resolution timing. The time source used ticked at the same speed as processor instructions. SQL 2005 calculates the value assuming that CPU frequency is constant and ever increasing, however there is no accurate way to determine how many CPU ticks may actually occur in a millisecond. Current hardware that can change CPU frequency in power management modes breaks this assumption. SQL 2008 has moved to a more reliable time source, and as such no longer attempts to determine CPU frequency.’
The irony in this whole thing is that the most popular application of this DMV calculation was in their own code release of the SQL Server Performance Dashboard (SQL 2005 SP2). The stored procedure MS_PerfDashboard.usp_Main_GetCPUHistory, would fail on SQL Server 2008 with the following error:
Msg 207, Level 16, State 1, Procedure usp_Main_GetCPUHistory, Line 6Invalid column name 'cpu_ticks_in_ms'.
So, what’s a DBA to do to get the CPU utilization stats for SQL Server 2008? I didn’t want to revert back to using some WMI\VBScript code, which I found slow, and unreliable (pre-2005), and a process outside SQL Server.
After some intense scouring of the internet, and close examination of the new sys.dm_os_sys_info DMV in SQL Server 2008, I noticed the ms_ticks column, present in SQL 2005, was still there in SQL 2008. This column returned the number of milliseconds since the computer was started.
According to one article I found, ‘The first column (cpu_ticks_in_ms) remained (mostly - within 0.000001% variance which may be due to rounding errors in ms_ticks) constant (and surprise, it was roughly the CPU frequency in ms - number of CPU ticks per millisecond). Note this happened in both SQL 2k5 and 2k8. In SQL 2k5, we also added cpu_ticks_in_ms to the output and again it was within 0.000001%’
OK, close enough. By making a relatively simple modification, by applying the new logic using these columns, we can get the value using this T-SQL Code:
select cpu_ticks / ms_ticks as ratio_ticks_in_ms, cpu_ticks, ms_ticks from sys.dm_os_sys_info
The above code works on both SQL Server 2005 and SQL Server 2008. Now let’s put it all together, and get some real understandable statistics by running the following code:
set nocount on declare @ts_now bigint
select @ts_now = cpu_ticks /( cpu_ticks / ms_ticks )
/*cpu_ticks / convert(float, cpu_ticks_in_ms)*/ from sys.dm_os_sys_info
select top 1 record_id,dateadd(ms, -1 * (@ts_now - [timestamp]), GetDate()) as EventTime,
SQLProcessUtilization,SystemIdle,100 - SystemIdle - SQLProcessUtilization as OtherProcessUtilization
from (select record.value('(./Record/@id)[1]', 'int') as record_id,
record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') as SystemIdle,
record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') as SQLProcessUtilization,
timestamp from (select timestamp, convert(xml, record) as record
from sys.dm_os_ring_buffers
where ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR' and record like '%<SystemHealth>%') as x
) as y order by record_id desc
Here you will get the output in a most readable format, suitable for realtime monitoring and reporting:
EventTime SQLProcessUtilization SystemIdle OtherProcessUtilization
2009-9-29 08:26:30:00.603 2 96 2
In addition to CPU Usage, you can get all sort of other great information, readily available in the sys.dm_os_sys_info DMV, including the Number of logical CPUs on the system, Ratio of the number of logical and physical processors, Amount of physical memory available, and more! (Take a look at BOL, or here for more on sys.dm_os_sys_info).
By now you know, I was using a play on words in my title, and the tick, in ‘tick me off’ was all about CPU Ticks in SQL Server’s DMV.
---------------------------------------------------------------------------------------------------------------------
For Remote DBA Services and SQLCentric monitoring, visit us at http://www.pearlknows.com