SAN best practices

  • Does anyone have any best practice documents/white papers on SAN configuration for SQL Server?

    I'm having a rather large fight with the storage team, who want to give me non-dedicated LUNs of no more than 200 GB. This is for a database with a primary file of 600GB and a log file of 250GB.

    The san is to be shared amoung many servers and I'm very worried that I may end up with drives shared between my transaction log/tempDB and the exchange server.

    I'v also been told that I cannot specify the RAID level, and the level I get will depend on a cost analysis done by the storage team's manager

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Gail

    there is a webcast aboutthis, at least I hope it's still available.

    http://msevents.microsoft.com/cui/WebCastEventDetails.aspx?EventID=1032275439&EventCategory=5&culture=en-US&CountryCode=US

    [font="Verdana"]Markus Bohse[/font]

  • Thanks. I'll check that out when I get home.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (5/8/2008)


    Does anyone have any best practice documents/white papers on SAN configuration for SQL Server?

    I'm having a rather large fight with the storage team, who want to give me non-dedicated LUNs of no more than 200 GB. This is for a database with a primary file of 600GB and a log file of 250GB.

    The san is to be shared amoung many servers and I'm very worried that I may end up with drives shared between my transaction log/tempDB and the exchange server.

    I'v also been told that I cannot specify the RAID level, and the level I get will depend on a cost analysis done by the storage team's manager

    I don't have any whitepapers handy, but sharing LUN's with Exchange Server was one of the primary causes for sending one of our busy SQL Servers into a tailspin (we were queuing IO's in the thousands during busy times). You're right to fight it.

    That being said - I'd also get this new "position" in writing. If they're going to chintz on your config, and not be willing to give you some say, then they need to take the blame when stuff stops performing well.

    No say whatsoever for you is not a reasonable position, and that needs to come out.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Storage take blame for a poorly performing database? What universe are you in? 😉

    That's my worst fear, and it's something I've had before. A datawarehouse load that was using the same fibre switch as my DB took down the business critical app for close to 12 hours.

    Some small excerpts from the most recent mail:

    Raid type is not dictated by anyone other than the storage team. In [head office] I would be suprised if the DBAs could tell you which system was on what raid type.

    The application team shouldn't be requesting dedicated drives. This defeats the entire purpose of a SAN. The performance will be horrible.

    I'm keeping on fighting, but I'm being dictated to, not consulted at this point.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Gail,

    You have to make them prove that they don't have bottlenecks on switchs in the fabric or the disks. Make them provide proof of throughput and go back to the SAN vendors. They often have to really dig in to check performance and a lot of times the sales/setup crews from the SAN vendors don't provide those tools at first.

  • GilaMonster (5/8/2008)


    Storage take blame for a poorly performing database? What universe are you in? 😉

    I know, I know.... I keep hoping to catch up with the tooth fairy - that little witch owes me money:)

    That's my worst fear, and it's something I've had before. A datawarehouse load that was using the same fibre switch as my DB took down the business critical app for close to 12 hours.

    Some small excerpts from the most recent mail:

    Raid type is not dictated by anyone other than the storage team. In [head office] I would be suprised if the DBAs could tell you which system was on what raid type.

    The application team shouldn't be requesting dedicated drives. This defeats the entire purpose of a SAN. The performance will be horrible.

    I'm keeping on fighting, but I'm being dictated to, not consulted at this point.

    Since this is business critical - perhaps take another approach (this got us some more traction):

    Business critical = reliability+uptime + performance. If you don't get to call the shots on the specific config, that's fine, but they must GUARANTEE that you will get the responsiveness and reliability you require. When we had our 9-month battle over this with our storage team- the SANAlyzer tool (which WE had to FORCE them to buy, those cheap B***ds - who else gets away with buying some 10M$ in hardware with no monitoring or tuning tool?) proved our point to a T when we finally convinced them to run the darned thing. (All LUN's our server "shared" were running at 110% of their rated max performance, and we were literally shutting out anyone else in the loop controller cache).

    You have SLA's to meet, established with the higher-ups, so the config they give you MUST be able to meet and exceed said SLA's, and they MUST be able to "prove" that they are. Notice there wasn't a mention of COST in there - that's not your concern after - that's theirs. Require monitoring, establish with them what parameters for performance are acceptable (perhaps in IOP/s, or no more than x % in SAN controller cache saturation, clean SANalyzer assessments, or on your terms: responisiveness as defined by xxx....), and what is to be done when those are not met. Address contingencies, DR, backups, etc...

    Yes - it's a lot more work, but with "claiming turf" comes "accountability". If they want to kick you out of the storage area, then they need to prove that they can handle it.

    You're not dealing with a file server after - this is as you describe a Mission-critical app, which in definitions around here means that cost is a distant third-level consideration, and NOT the first or second.

    Finally - don't take no. Go up the food chain, all of the way up to whoever determined that this app is mission-critical if need be. I had to go "through" my manager, over the director, around the storage manager, just to throw a hissy fit in the CIO's office, just so that this popped up on the "radar".

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Thanks. I'll check them out.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • That web cast (Best Practices for Deploying SQL Server 2005 on Large Storage Systems) is still available. I just downloaded it. Thanks!

    Tim

  • This material seems quit interesting ...

    - http://www.sql-server-performance.com/articles/per/2000io_config_sannas_p1.aspx

    - http://sqlserver-qa.net/blogs/perftune/archive/2008/01/28/3300.aspx

    - http://blogs.msdn.com/benjones/archive/2005/09/12/463964.aspx

    - http://search.techrepublic.com.com/search/best+practice+and+hp+storageworks+and+microsoft+sql+server.html

    - http://www.dell.com/content/topics/global.aspx/power/en/ps3q01_muirhead?c=us&cs=555&l=en&s=biz

    - http://h71028.www7.hp.com/ERC/downloads/4AA0-2948ENW.pdf

    (started at http://whitepapers.zdnet.com/abstract.aspx?docid=166254 )

    Some of them are based on SQL2000, but I think the top basic rules are still valid.

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Excellent. Thank you.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Found a couple of SQLCAT blog posts:

    Deploying SQL Server 2005 with SAN #1

    SQL Server 2005 Configuration Blog #2.doc

    Deploying SQL Server 2005 with SAN #3

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Seing that sqlcat list, this may also add to a well concerned decision:

    http://www.microsoft.com/technet/prodtechnol/sql/bestpractice/pdpliobp.mspx

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • I seem to recollect we have had this same discussion so many times over the last few years. As a certified ( http://www.snia.org ) person I'd suggest you challenge the storage teams knowledge, training and qualifications. Shared LUNs for anything like exchange or database is going to be a problem.

    The important point is that SAN storage is no different than DAS, so ask your storage team if they'd be happy to create one raid 5 or raid 6 ( as they won't tell you the raid I guess it's 5 or 6 ) and then sub divide your sql drives across it and what sort of performance they'd expect?

    I find an unbelievable arrogance from storage teams and some san vendors about databases and sans and this very subject matter.

    The point is that SAN storage is just a better way of managing storage and making it available it doesn't give any magical performance benefits or make raid 5 outperform raid 10 for writes. You'd not do anything different on a SAN than you'd do with DAS as far as LUNs / arrays / drives.

    You also need to make sure you get partition alignment - you might like to read this http://sqlblogcasts.com/blogs/grumpyolddba/archive/2008/04/03/disk-partition-alignment-sans-and-diskpart.aspx

    Take it from me shared LUNs are a disaster for a SQL server if it's a busy server.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

Viewing 15 posts - 1 through 15 (of 57 total)

You must be logged in to reply to this topic. Login to reply