SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

SQL Man of Mystery

Wes Brown is a PASS chapter leader and SQL Server MVP. He writes for SQL Server Central and maintains his blog at http://www.sqlserverio.com. Wes is Currently serving as a Senior Lead Consultant at Catapult Systems. Previous experiences include Product Manager for SQL Litespeed by Quest software and consultant to fortune 500 companies. He specializes in high availability, disaster recovery and very large database performance tuning. He is a frequent speaker at local user groups and SQLSaturdays.

Fundamentals of Storage Systems, IO Latency and SQL Server

A Thousand Men Marching Still Only March As Fast As One Man.

la·ten·cy - Computers . the time required to locate the first bit or character in a storage location, expressed as access timeminus word time.

Often when talking to people about performance they get rapped around the MB/Sec number and ignore a critical factor, latency. SQL Server is one of those few applications that is very sensitive to disk and network latency. Latency is what the end user sees. If your SQL Server is waiting around for disk or network they will start to complain. In an OLTP environment SQL Server accesses data on disk in a nonlinear fashion, aka random IO’s. The size of these IO request can be pretty small. In a good application you really try and limit the amount of data returned to keep things speedy. The down side of small Random IO’s is the system will never be faster than a single seek operation on your disk. So, if you have a 15k SAS drive that is around 2.5ms. Caching and buffering schemes aside for now, 2.5ms is your floor. You will never be faster than that. Depending on the size of the IO request you spend more time waiting for the seek operation than you do actually transferring the data from the disk. We group disks together in larger arrays to give us more aggregate throughout and higher operations per second, but you are ever only as fast as your slowest disk. You can always get more aggregate throughput up to several gigabytes a second but you still have to wait on that first bit to travel to you.

To get around these short comings, SQL Server buffers as much data as it can in memory. We also add large amounts of cache to our SAN’s and disk controllers. This can mask some of the problem, but at some point it still needs to come from your disk drives.

On the network side things are actually better. With latency under a millisecond on a LAN you usually waiting on disk to deliver the data. There are other factors like the speed of the network equipment and number of hops across interfaces you have to make can be more significant than the actual transmittion rate. TCP/IP can be a factor as well. Out of the box SQL Server is configured at 4KB(4096 byte) packet. This is a good general setting for most workloads. If you are working on a highly tuned OLTP system you may want to set that to something smaller or align it with the TCP packet size for your network, usually 1500 bytes. If it is a OLAP system with lots of streaming throughput latency will make up a very small part of the overall transmission time and going with a larger packet size possibly aligned to the 8KB page size with increase throughput and decrease the time to transmit overall. If you do set a large packet size you should consider enabling jumbo frames on your network card. Make sure that your network equipment can support the jumbo frame from end to end.

Another place where we are starting to see more latency issues is with database mirroring. In synchronous mode, the default setting, you are now adding network latency plus the disk latency of the other server to the overall transaction time.

Mirroring isn’t the only game in town. We have had SAN level replication for quite a while as well. In most of the scenarios where we were using SAN level replication it was site to site across several miles. To give you an idea of how much latency can be added in this situation go ping yahoo.com or google.com, I’ll wait….. Ok, from my location ether of them is 45ms~75ms, or 18 times slower than your spinning disks. All the sudden, your network is the major player in delaying transactions. I’ve used fibre optics to connect to sites and the latency can still be a killer for OLTP systems. The general rule of thumb is 7.5 microseconds for every  1 1/2 miles. If our target SAN is 125 miles away we just added 2ms of latency to the 4ms of latency the two sets of disks are providing. In reality, it is worse than that when you again figure in network equipment. I personally haven’t seen synchronous setups more than 50 miles apart.

Just something to keep in mind as you plan your SQL Server infrastructure. Latency in its myriad forms is king.

Series To Date:
  1. Introduction
  2. The Basics of Spinning Disks
  3. The System Bus
  4. Disk Controllers, Host Bus Adapters and Interfaces
  5. RAID, An Introduction
  6. RAID and Hard Disk Reliability, Under The Covers
  7. Stripe Size, Block Size, and IO Patterns
  8. Capturing IO Patterns
  9. Testing IO Systems
  10. Latency – You are here!


No comments.

Leave a Comment

Please register or log in to leave a comment.