Over the past 10 or 15 years as I have worked in various technical organizations, I have been faced with debating the pros and cons of various storage devices and how they impact the performance of SQL Server.
In my current role as Manager of Software Engineering, I began to question the use of our NetApp FAS-3160 storage device for our SQL Server installations. I began to see disk queueing and latency in our production environment. Like any other SQL Architect worth their salt, I immediately blamed the IT guys and their IO device. I met with our IT Team and with Engineers off site at NetApp to discuss my concerns and to discover any ways that we could potentially improve our SQL performance and this is what I learned.
In the world of enterprise storage appliances, devices are basically segregated in to 2 categories: Network Attached Storage (NAS) Devices and Storage Area Network (SAN) devices. There is a huge difference between these two types of appliances and what their intended purpose is and in how they store and acess data.
The NAS and "File Level Storage"
A NAS is intended to provide simple, inexpensive file level storage for an organization (Essentially a large file share). A NAS accesses the data stored on it by the file name and byte offsets. Although, you can get good IO performance out of a NAS, the primary intent of a NAS is not to provide screaming performance at the IO Level. The primary purpose of a NAS is to provide bulk disk storage at a reasonable price for storing files. The NAS is exposed to users over the network and typically uses Ethernet and TCP\IP based protocols to communicate back and forth between client and host (NFS or SMB/CIFS).
The prices for a NAS can vary widely depending on the amount of storage and the level of performance required, but typically you will see these devices being in the range of hundreds of thousands of dollars. I believe the ones Ive worked with in the past and present have been in the $250k - $300k range.
The SAN and "Block Level Storage"
The SAN is intended to be seen (and accessed) as a regular disk drive to a server. SAN devices read and write data as raw data blocks, not by file name and offset. The intent of a SAN is to provide high performance storage for the most demanding, IO Intensive workloads in an organization. SAN devices are not intended to be a low cost storage solution. SAN devices typically use fiber channel (FC, FCoE) or iSCSI connections to communicate between the client and the host.
Due to the high performance requirements of the typical SAN devices, these appliances tend to be extremely expensive. Typically, SAN hardware will range in the millions of dollars. The SAN appliances I have worked with in the past have been well over $1 million.
Death By Documentation
As I began researching our NetApp device, the way we were using it, and the way we had it configured, I got more and more confused. The first thing I found when I "googled" the FAS-3160 was that it supports BOTH NAS and SAN protocols. In other words, it supports both file level and block level storage. I didnt understand how this could be true.
As I dug deeper and deeper into the documentation, I was completely baffled. I had to ask, "Is this thing a NAS or is it a SAN?" The answer, as I found from research, was "YES"; it's both actually.
So, although confusing, this was good news to me as this meant that we could configure the NetApp to use either NFS or iSCSI protocols. I continued my research and the documentation on NetApp, suprisingly enough, suggested that there was negligible difference in IO performance from one protocol to another. The documents were saying that regardless of whether you used, NFS, SMB/CIFS, FC, FCoE or iSCSI, the IO performance was going to be roughly the same. This didnt make any sense to me as I had always read and had always been lead to believe that the performance of SQL Server on a NAS was terrible whereas SQL Server performance on a SAN was smoking fast.
As it turns out, our NetApp was configured to use NFS. More on this later.
Microsoft "Support" of File Level Storage (NFS)
In continuing my research, I read many articles and forum postings regarding the use of NFS for SQL Server. Most data suggested that Microsoft did not support NFS (Or File Level Storage) for SQL Server. Turns out, they do support it, but they only started doing so recently (most likely due to industry pressure). File Level Storage was not supported in SQL Server until SQL Server 2008 R2. As of SQL Server 2008 R2, SQL Server now "Supports" File Level Storage. However, even though SQL Server now supports NAS technology, most folks will tell you that it is still not recommended for a high performance SQL Server environment.
IO is NOT IO !!!
When visiting on site with the NetApp folks, I was able to finally clarify the confusion I had regarding the numerous IO performance documents that I had read. NetApp Engineers explained to me that the published numbers simply compare one protocol to another in raw IO statistics. The published numbers do NOT take SQL Server behavior into consideration. SQL Server, itself, wants (demands) Block Storage. So, even though, in raw IO, the various protocols might perform similarly, in a SQL Server Environment specifically, iSCSI can perform as much as 40% better than NFS. Whew, things were starting to make sense.
So, what did I learn? Quite a bit actually. First of all, I learned that technology changes over the past few years have created a situation in which NAS and SAN devices are converging. This is very apparent by the fact that our Appliance can be configured to host multiple logical drives each with various protocols. In other words, I can host a few drives using NFS and a few other drives using iSCSI all on the same appliance at the same time (this is pretty cool).
Furthermore, I learned that years ago, the IO performance of a SAN compared to a NAS was immensely different due to the bandwidth restrictions of the Ethernet cards at the time. Historically, Ethernet connections were limited to 1GBps due to the NIC Cards. SAN technologies using Fiber Channel could blast out IOPs at 4GBps. Because of this, the SAN was almost always superior in raw IO bandwidth. However, these days, we can get Ethernet cards that can support up to 10gbs so that limitation has been mostly eliminated.
Even though, the various protocols now have the ability to support the same IO throughput, SQL Server itself is expecting block storage. So, by exposing block storage for your SQL installations, you optimize performance dramatically regardless of what IO the device itself is capable of handling.
As we as a company grow, we now know that we will likely need to reconfigure our NetApp to use the iSCSI protocol for our SQL Environment. Fortunately for us, this is not an emergency. SQL Server 2008 R2 does support NFS so we dont have to make this migration immediately. As IO performance becomes a major bottleneck, we will implement the necessary changes. Until then, I can rest easy.
I read so many documents that it didnt make sense to list them all, but I did want to give credit to a couple of the most useful resources.
Block Level Storage vs File Level Storage - http://www.techrepublic.com/blog/datacenter/block-level-storage-vs-file-level-storage-a-comparison/3766
SAN Vs NAS - Whats the difference - http://searchoracle.techtarget.com/tip/SAN-vs-NAS-What-s-the-difference
TechNet: Physical Database Storage Design - http://technet.microsoft.com/library/Cc966414