Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Ramesh Meyyappan’s SQL Server Performance Tuning Blog

Ramesh Meyyappan (www.sqlworkshops.com) is a SQL Server specialist with expertise in Performance Tuning. Ramesh worked at Microsoft Corporation from 1994 to 2004. Nearly half of that time he worked in Redmond in the development teams - specifically as Program Manager in the SQL Server Development Team responsible for optimizing SQL Server product for SAP. Ramesh now offers onsite and offsite consulting and workshops independently as well as by partnering with various Microsoft Subsidiaries. LinkedIn Profile: http://de.linkedin.com/in/rmeyyappan.

SQL Server IO handling mechanism can be severely affected by high CPU usage

Are you using SSD or SAN / NAS based storage solution and sporadically observe SQL Server experiencing high IO wait times or from time to time your DAS / HDD becomes very slow according to SQL Server statistics? Read on… I need your help to up vote my connect item – https://connect.microsoft.com/SQLServer/feedback/details/744650/sql-server-io-handling-mechanism-can-be-severely-affected-by-high-cpu-usage. Instead of taking few seconds, queries could take minutes/hours to complete when CPU is busy.

To read additional articles I wrote click here. To watch the webcasts click here.

In SQL Server when a query / request needs to read data that is not in data cache or when the request has to write to disk, like transaction log records, the request / task will queue up the IO operation and wait for it to complete (task in suspended state, this wait time is the resource wait time). When the IO operation is complete, the task will be queued to run on the CPU. If the CPU is busy executing other tasks, this task will wait (task in runnable state) until other tasks in the queue either complete or get suspended due to waits or exhaust their quantum of 4ms (this is the signal wait time, which along with resource wait time will increase the overall wait time). When the CPU becomes free, the task will finally be run on the CPU (task in running state).

The signal wait time can be up to 4ms per runnable task, this is by design. So if a CPU has 5 runnable tasks in the queue, then this query after the resource becomes available might wait up to a maximum of 5 X 4ms = 20ms in the runnable state (normally less as other tasks might not use the full quantum).

In case the CPU usage is high, let’s say many CPU intensive queries are running on the instance, there is a possibility that the IO operations that are completed at the Hardware and Operating System level are not yet processed by SQL Server, keeping the task in the resource wait state for longer than necessary. In case of an SSD, the IO operation might even complete in less than a millisecond, but it might take SQL Server 100s of milliseconds, for instance, to process the completed IO operation. For example, let’s say you have a user inserting 500 rows in individual transactions. When the transaction log is on an SSD or battery backed up controller that has write cache enabled, all of these inserts will complete in 100 to 200ms. With a CPU intensive parallel query executing across all CPU cores, the same inserts might take minutes to complete. WRITELOG wait time will be very high in this case (both under sys.dm_io_virtual_file_stats and sys.dm_os_wait_stats). In addition you will notice a large number of WAITELOG waits since log records are written by LOG WRITER and hence very high signal_wait_time_ms leading to more query delays. However, Performance Monitor Counter, PhysicalDisk, Avg. Disk sec/Write will report very low latency times.

Such delayed IO handling also occurs to read operations with artificially very high PAGEIOLATCH_SH wait time (with number of PAGEIOLATCH_SH waits remaining the same). This problem will manifest more and more as customers start using SSD based storage for SQL Server, since they drive the CPU usage to the limits with faster IOs. We have a few workarounds for specific scenarios, but we think Microsoft should resolve this issue at the product level. We have a connect item open – https://connect.microsoft.com/SQLServer/feedback/details/744650/sql-server-io-handling-mechanism-can-be-severely-affected-by-high-cpu-usage - (with example scripts) to reproduce this behavior, please up vote the item so the issue will be addressed by the SQL Server product team soon.

Comments

Leave a comment on the original post [blog.sqlworkshops.com, opens in a new window]

Loading comments...