Average Disk Queue Length

  • Hi,

    We have a SQL server which we recently hooked up to a SAN. We have still to migrate data to the SAN, the SQL is still running with the old drives, but I have seen a big big increase in the above tnning parameter, in fact it stays at 100 all the time, as a result the performance has got bad.

     

    I have another SQL server which also is in the same scenario, but there I do not see any change in this parameter and the perfoemance is good on the server.

     

    Is there anything in the setting-up of the SAN that resulted in this problem, for bringing the server on the SAN we added 2 HBA cards on the server which is linked through Fibre Channel to the SAN.

    Did anyone have a similar experience. Any help is highly appreciated.

     

    Regards

    rama

  • We had a very similar situation when we migrated to a new config on a SAN. The new system was 4 times slower, 8 hours vs 2 hours, than the old system doing a DBCC DBREINDEX on a large table. We were only able to get decent performance after we striped the SAN volumes at the Windows level.

    I suspect we have a SAN config problem at this site as the same job runs in 80 minutes on a third system on a different SAN. In this case we are not striping the SAN volumes, just spanning.

    Go figure.

    Good luck.

  • Check your HBA's are configured for max speed, probably 2 gig.

    Use Load Balancing software like power path to distribute the traffic over the HBA's.

    Use as small as possible meta's when creating the drives. We originally had 4x8gig, we now have 8x4gig and the performance improvement was significant, about 50% better. You can then stripe these drives at the OS.

    Use WIN2K3 it has optimised drivers for connection to the SAN.

    Check the sector alignment, we had to create a 4k block at the beginning of the disk to align the sectors.

    Format for 64K block size.

    We had similar problems as you appear to have and we resolved them, but it took about 2 weeks.

    The vendors of our SAN were supposed to be the experts, they told us the disk cache (32gig) would be the answer to performance problems. They were wrong.  You need to get the best performance out of the disk and then let the cache help.  We now get 122mb/sec when we backup a database, we only got 68mb/sec the way they initially set it up.

    Good Luck

     

  • We are having similiar problems with our HP/Compaq san. 

    Mark,

    -----Use Load Balancing software like power path to distribute the traffic over the HBA's.

    You are referring to the software that comes from Compaq or is this a third party item? The only thing i see is Secure path.

    ----Use as small as possible meta's when creating the drives. We originally had 4x8gig, we now have 8x4gig and the performance improvement was significant, about 50% better. You can then stripe these drives at the OS.

    Are you referring to using the most spindles as possible here?

    ----Check the sector alignment, we had to create a 4k block at the beginning of the disk to align the sectors.

    Huh? I've never heard of this, but are you referring to the SAN or within windows?

    Thanks for the help!

     

  • Power path is an EMC product, compaq probably has there own version of it.  It effectively balances the traffic across all the fibre channel paths to the SAN

    The disks in the SAN are 256gig each, but obviously we wouldn't want to put one database on one drive as the performance wouldn't be all that good. What we can do is split these disks into 64x4gig chunks and these chunks are stitched back together with other drives (more spindles). We needed 32gig drives which we could stripe together at the OS to give us 64,96,128 gig etc OS logical drives. This 32 gig  size was the best fit for our databases and gave us the least amount of wasted disk space (some databases were only 20 gig, which leaves 12 gig on the volume that we probably won't use). We were initially recommend the 4x8gig configuration by the vendor, but the performance wasn't there.  Try doing a backup of a large database and check the throughput from the output completion mesage in query analyser. It gives you a mb/sec throughput. We average 122mb/sec and I have seen it as high as 144mb/sec on our SAN.  You may need to backup up another big database first as you need to flush the cache on the SAN.  We have a 32 gig cache on the SAN. If I do two backups of a 12 gig database in a row the first ones throughput is 122mb/sec but the second is around 300mb/sec as the database is sitting in cache and the backup operation doesn't read from disk. You also have to be careful you aren't getting disk queueing when writing the backup file to disk as this will slow down the reading of disk.  We got around this using the Litespeed backup utility by dbassociates, as it compressed the backup file to 25% its normal backup size.

     

    Check out this link for sector/partition alignment

    http://www.microsoft.com/windows2000/techinfo/reskit/en-us/default.asp?url=/windows2000/techinfo/reskit/en-us/prork/pree_exa_oori.asp

     

    good luck

  • Thanks Mark. So if I understand you correct, i should have the SAN broken up into various smaller sizes. ie. for us, we're using 72GB drives and so what 36GB chunks x let say 10.  So that would be 32 x10 or about 5 drives. Under the OS, this should be recognize as 10 36GB drives.  I would use the OS to stripe them together as raid 5 or whatever i want.  I checked out the sector info. Thanks that was great, now I just need a system to test this on. Scary....

    Let me know if this is okay. Thanks again Mark! Much appreciated!

  • If you have 72gig drives then you would split each drive into 18 x 4gig volumes (meta's) or even 2 gig volumes and then stitch them back together with 4gig volumes from other drives this is done on the SAN through the bin file. (These are called Hypers).  If you need 36gig volumes to be presented to the OS for your application you would combine 8 x 4 gig volumes from separate drives.

    We settled on 32gig volumes presented to the OS as we anticipated our data growth was an addition 32gig per year over the previous year. i.e. 2003 - 150gig, 2004 - 184gig, total for both years 334gig.

    Last year we striped 4 x 32gig volumes together at the OS level to give us one drive of 128gig.  This year we striped 5 x 32 gig volumes at the OS to give us 160 gig.  This could potentially give us 160 gig spread over 40 disks.

    Our database then only consists of one primary file a secondary file for each year and a log file. This has the benefit of simplified management of the database.

    By having the 32gig hypers all preconfigured on the SAN, assigning them for use at the OS is a snap and is done through disk management in windows while the system is up and running and the disk is available for use within 1 hour (need to format the disk). On one of our other SANs from a different vendor we have to remove all the data, allocate more space and put the data back on the disk in order to get the best performance out of the disk.

    Also, remember to format the disks at the OS for 64K as this is the size of a SQL Server extent and its what SQL Server reads and writes at.

    Spend as much time as you can testing different variations of the above if possible.  We had 6 weeks to implement the first and one of our biggest (450gig) and most important (internet banking site) databases on the SAN, we didn't put the database on there until the last week. We got most of our performance benefits in the first 5 weeks of testing.

    Going back to the original post, this is what you get when you let the SAN management software manage the SAN.  The software doesn't know about the application that is running and the best way to configure the disk for your application, or the best part of the disk for your application. The SAN management software will just allocate the next available block. This block may be on the same physical disks as 5 other blocks for the same application. The vendors will say the cache in the SAN will mitigate this problem. The reality is that it will to a point for all applications reading and writing small amounts of data, but when you need the best performance from your disk and database you need to plan what disks are used for what application, how many disks are striped together, what part of the disk is used, the size of the physical disks.

    Large databases sometimes need to trawl through large amounts of data, ie. (spaning multiple years) if you have 10 gig of cache in your whole SAN, your database application may get 2 gig of this.  If you have an application that is getting different data all the time its always going to go to disk regardless of cache because is can't hold the whole database in the cache.

    The secret is to get the best out of the disk and then let the cache give you additional benefits.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply