SAN and NAS protocols and how they impact SQL Server

  • Comments posted to this topic are about the item SAN and NAS protocols and how they impact SQL Server

    Gregory A Jackson MBA, CSM

  • One difference which I had always assumed (and may be wildly inaccurate!) is that NAS communication all goes through the network stack (TCP), whereas SAN goes through the disk I/O stack, though this may be sadly out of date now that iSCSI has come to the party.

  • You can now get a 20 TB SAN with high-end FC disks and an integrated NAS appliance from EMC for under €100K.

    I think that if you use your database the same way you watch a movie (sequentially reading a few 10 GB files), throughput is the only thing that matters, and there will be little difference between 8 Gb FC and 10 Gb Ethernet.

    If your database load is transactional, with many users performing short transactions (leading to 1 KB synchronous writes), latency is what matters.

    As long as you hit the SAN's write cache (thus minimizing pure physical disk performance and focusing on protocol and adapter overhead), you'll get at least 20,000 IOPS out of your FC SAN, which is out of reach for any 10 Gb iSCSI setup.

    If you reproduce log-file activity with SQLIO (1 KB unqueued, synchronous, sequential writes to a 1 GB file, which fits in your SAN's write cache), you'll get, say, 30 microseconds of latency on 8 Gb FC vs. 200 microseconds on 10 Gb iSCSI (not even talking about NFS ;).

    This latency is more or less the floor on the duration of every _short_ transaction. Think about it if you cannot change an application that keeps over-committing against your database.
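    To illustrate the point with a back-of-envelope sketch (using the example latencies from this post, not measurements): with synchronous log writes, a short transaction cannot commit faster than one round-trip log write, so per-IO latency caps the serial commit rate.

    ```sql
    -- Back-of-envelope: serial commits/sec is bounded by 1 / log-write latency.
    -- Latencies below are illustrative figures, not benchmark results.
    SELECT 1.0 / 0.000030 AS max_commits_per_sec_at_30us,   -- 8 Gb FC example
           1.0 / 0.000200 AS max_commits_per_sec_at_200us;  -- 10 Gb iSCSI example
    ```

    At 30 µs per write you can serially commit several times more transactions per second than at 200 µs, which is why the protocol matters for chatty OLTP workloads even when throughput numbers look similar.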

    http://www.demartek.com/Reports_Free/Demartek_LSI_CTS2600_Evaluation_2011-10.pdf

  • NetApp's and ZFS's iSCSI/FC support is done via block emulation. Let's say you create a 100 GB iSCSI LUN; on the backend it is actually a 100 GB file, and a process emulates the iSCSI/FC protocol on top of that file (you can do that in Linux too).

    That doesn't mean it is slow, though. As far as I know the emulation is built into the kernel. Second, the NetApp appliance has 512 MB of battery-backed NVRAM, which is used to accelerate write performance. Third, operating at the file level lets you take lots of snapshots (WAFL and ZFS are both copy-on-write file systems). In addition, NetApp and ZFS have some cool SSD tricks that can enhance IOPS (ZFS has the ZIL and L2ARC; NetApp has SSD PAM).

    IMO the EMC VNX is heading in the right direction, but on the backend it is still two different OSes (DartOS for NAS and Clariion for SAN). I wonder when they will merge them into a single OS.

  • One of the main benefits of a SAN is that if you have homogeneous server hardware, you can use the free Xen to get hot failover for virtual machines. I've seen a lot of hype over Xen Server's abilities on this front, which require a shared SAN.

    This can be creeping death for the unwary DBA.

    A SAN gives fast, redundant, off-server, snapshottable shared storage (see NetApp).

    + XEN can give us the ability to have super-fast redundant failover using the SAN

    + SQL Server can be installed on a Virtual Machine

    = Put the above together, and you have what looks like a 'good' redundant system.

    Sadly, from tests with SQL on XEN+SAN, the I/O throughput is surprisingly bad. You must over-supply your servers with RAM to cope with the poor I/O, and you must provide more processor cores.

    Running a production server this way may seem like a good choice, but the cost is prohibitive, and the performance for a high-I/O system disappointing. This becomes doubly true if your sysadmins see the expensive NetApp SAN with all that unused space and throw up a few more VMs doing their I/O against the same disk set. Caveat NASor 😉

  • Such HA features are also available on VMware with quite good throughput. Low latency may be obtained with Raw Device Mapping, but I'm not sure if VMotion works with RDM.

    Think of partition alignment.

  • I am in the process of moving production databases over to a NetApp 2040 SAN and I can commiserate with "death by documentation". I've spent weeks rummaging around on NetApp's support site and through endless oceans of PDFs in order to wrap my head around all the aspects that SAN technology brings to the SQL Server party. We are a small shop and I do not have the luxury of a full-time SAN admin. I will stress that educating yourself and testing the configuration are completely necessary; in my experience, deviate from NetApp best practices and all bets are off.

  • If you can spare some time, how about giving ZFS with Nexenta a try? Just make sure you go over the ZFS best practices guide and the ZFS evil tuning guide.

    And don't use dedupe unless your system has a lot of ram.

  • Hi Gregory Jackson,

    Regarding SAN and NAS, we are facing a production issue. Would you please help me out with this?

    Scenario:

    During design layout we planned to create a table t1 with 5 columns: 4 columns on the SAN, and 1 column (Img1) plus its related indexes on the NAS (because this column holds the image data).

    At some point the NAS went down, and the cluster manager performed a failover from the active to the passive server. During the changeover, the failover server could not recognize that the NAS was down. As a result, the production application at more than 300 sites (a 24/7 operational database) could not access the central server. Because the Img1 column sits on the NAS, SQL Server 2008 R2 could not come up. This was a SEV1 ticket.

    To avoid this scenario I suggested the project go for SAN.

    Now they have provided SAN as well.

    My environment uses merge and transactional replication. There is no downtime window for this environment.

    The NAS-based Img1 column is in FileStreamGroup1.

    For a trial run in the dev environment:

    I created a new table t2, created a new FileStreamGroup2, and set it as the default.

    After populating the data from table t1 into t2, I cannot rename the table because it is a replicated table.

    What is the better solution?

    There is no way to attach/detach the database or take it offline, etc.

    Thanks in advance

    Best regards

    Santhosh

  • Pardon me while I vent while we're on the topic of SANs: I've discovered there are good SAN admins and bad SAN admins. The bad ones say 'SAN is teh fastest evar! You are forbidden to have stoopid local disk!' Good ones say 'How can SAN best fit into your storage requirements?' I've worked with both... I feel sorry for everyone who has to deal with the bad kind.

  • While much of this article is accurate, I need to correct a few misconceptions.

    I am a PM, working on the storage engine for SQL Server, specifically on access to storage devices. Over the past couple of years, I've been working with the Windows File Server team to see if there was anything we could do about the poor performance of SQL running over SMB.

    As it turns out, there was. It seems that SQL had never paid attention to NAS because nobody used it, because the performance sucked. Windows didn't pay attention to workloads like SQL because again, nobody used it, so it wasn't on their radar. We got the File Server team using TPCC as one of their performance benchmarks for the file server, and some interesting things came to light.

    The first was a bug which had been in the client side of the protocol stack since Windows 95! It effectively serialized all write traffic. When we started out, SMB was at ~20% of the performance (as measured by TPCC) of direct-attached storage. By the time we were done, performance over SMB was at 97% of direct-attach.

    The great thing about this is that it is a client-side fix, which has been released as a Windows QFE to Windows Server 2008 R2, and I believe 2008 as well. Because it's client-side, you get the benefit regardless of what NAS device is on the other end of the wire.

    Going beyond that, the File Server team has incorporated the RDMA protocol, in which network traffic goes directly from the wire into user memory without being copied to kernel space. This dramatically reduces CPU consumption, which can otherwise eat most of your CPU, even in iSCSI environments. Now the CPU used by the TCP stack is comparable to Fibre Channel.

    There are also a collection of significant enhancements to the protocol, moving the protocol from 1.0 to 2.2.

    The Windows architects have published all of these changes, and have been sitting with all of the major NAS vendors, to make sure that they can also take advantage of these improvements.

    Bottom line is that just within the past year or so, the landscape has significantly shifted, so that now it is possible to put together a NAS-based configuration which performs on par with a SAN.

    Kevin Farlee

    SQL Server Storage Engine PM

  • Gregory,

    I work in a mid-sized company and have the luxury/challenge of being both the DBA and the storage admin. We also use NetApp storage, mostly with FC in one datacenter and iSCSI in another. When you decide to move your databases to iSCSI LUNs, here's my favorite way of doing it: you can move the datafiles ONLINE! I've used this method several times to move entire datafiles to new LUNs as an online operation.

    1. Create the new LUNs and attach them to the SQL Server using iSCSI or FC

    2. Add datafiles to the existing filegroup (create SQL data files on the freshly attached iSCSI or FC storage)

    3. Empty the existing data file into the newly attached one. This is an online operation. "DBCC SHRINKFILE(<FileName>, EMPTYFILE)"

    4. Voila! The datafiles have now been migrated to iSCSI or FC. 🙂

    5. Remove the old (now unused) files from the SQL filegroup.

    Brad Hoff, @SqlPhilosopher, has created a blog post describing the method in detail: http://www.sqlphilosopher.com/wp/2012/02/moving-a-filegroup-to-a-new-disk-array-with-no-downtime
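    The steps above can be sketched in T-SQL roughly as follows; the database, file, and path names are hypothetical, so adjust them for your environment (and test in dev first):

    ```sql
    -- 1/2. Add a new data file on the freshly attached iSCSI/FC LUN
    ALTER DATABASE MyDb
    ADD FILE (NAME = MyDb_Data2,
              FILENAME = 'E:\NewLun\MyDb_Data2.ndf',
              SIZE = 10GB)
    TO FILEGROUP [PRIMARY];

    -- 3. Empty the old file into the other files in the filegroup.
    --    This is the online step; it moves pages while the DB stays available.
    DBCC SHRINKFILE (MyDb_Data1, EMPTYFILE);

    -- 4/5. Remove the old, now-empty file from the filegroup
    ALTER DATABASE MyDb REMOVE FILE MyDb_Data1;
    ```

    Note that `EMPTYFILE` distributes the pages across the remaining files in the same filegroup, so make sure the new file is large enough (or allowed to autogrow) before you start.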

    That will not work for your transaction log file, though. For that:

    1. CHECKPOINT

    2. Take a tlog backup

    3. Shrink the transaction log (This will make the copy time for the log file minimal)

    4. Run ALTER DATABASE MODIFY FILE to change the location of the ldf file.

    5. Take database offline.

    6. Move the physical file from old location to new location.

    7. Bring database online.

    8. Expand the log file back to its original size.
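    Here is a rough T-SQL sketch of that log-move sequence; again, the names and paths are hypothetical placeholders:

    ```sql
    -- 1-3. Checkpoint, back up the log, and shrink it so the file copy is quick
    CHECKPOINT;
    BACKUP LOG MyDb TO DISK = 'E:\Backups\MyDb_log.trn';
    DBCC SHRINKFILE (MyDb_Log, 1);

    -- 4. Point the catalog at the new location (takes effect on next startup)
    ALTER DATABASE MyDb
    MODIFY FILE (NAME = MyDb_Log, FILENAME = 'E:\NewLun\MyDb_Log.ldf');

    -- 5-7. Take the database offline, move the .ldf in the OS, bring it back
    ALTER DATABASE MyDb SET OFFLINE WITH ROLLBACK IMMEDIATE;
    -- ...copy/move MyDb_Log.ldf to E:\NewLun\ at the OS level, then:
    ALTER DATABASE MyDb SET ONLINE;

    -- 8. Grow the log back to its original size to avoid autogrow stalls
    ALTER DATABASE MyDb
    MODIFY FILE (NAME = MyDb_Log, SIZE = 10GB);
    ```

    The offline window only needs to be as long as the file copy, which is why shrinking the log first matters.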

  • Kevin,

    thanks so much for the information....

    Keep up the good work on that Engine

    🙂

    Greg J

    Gregory A Jackson MBA, CSM

  • James,

    Thanks for reaching out....I really appreciate the info.

    GAJ

    Gregory A Jackson MBA, CSM

  • I found this to definitely be true

    GAJ

    Gregory A Jackson MBA, CSM
