March 23, 2008 at 9:52 pm
Hi all,
Our company is about to implement a data warehouse on SQL Server 2005.
We have 3 physical blade servers and a SAN, and we need to make the best use of them to run ETL jobs. At the moment we're considering the following options:
+ Use VMware to cluster these 3 blade servers into one logical server
+ Use Windows Server 2003 clustering
+ Install 3 instances of SQL Server 2005 and use MSX/TSX to send jobs to target servers (rough sketch below)
What are the pros and cons of each, and are there any other approaches?
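For what it's worth, here's roughly the shape the MSX/TSX option would take; all server, job and procedure names below are placeholders, not our real setup:
[code="sql"]
-- Run on each target server (TSX) to enlist it with the master (MSX):
EXEC msdb.dbo.sp_msx_enlist @msx_server_name = N'BLADE01';

-- Then, on the master server, create the job once...
EXEC msdb.dbo.sp_add_job @job_name = N'Nightly ETL load';

EXEC msdb.dbo.sp_add_jobstep
    @job_name  = N'Nightly ETL load',
    @step_name = N'Run staging load',
    @subsystem = N'TSQL',
    @command   = N'EXEC Staging.dbo.usp_LoadStaging;';  -- placeholder procedure

-- ...and push it to the target servers that should run it:
EXEC msdb.dbo.sp_add_jobserver @job_name = N'Nightly ETL load', @server_name = N'BLADE02';
EXEC msdb.dbo.sp_add_jobserver @job_name = N'Nightly ETL load', @server_name = N'BLADE03';
[/code]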
The other question is that each of our databases (Staging, data marts, etc.) is now about 100GB. At this size should we keep each DB in one file or split it into multiple files? If the latter, what is the optimal size for each file?
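To make the file-layout question concrete, this is the kind of thing we're weighing up; drive letters, names and sizes are illustrative only:
[code="sql"]
-- One ~100GB database split into four 25GB files, each ideally on its
-- own physical array so reads and writes don't fight over spindles.
CREATE DATABASE Staging
ON PRIMARY
    (NAME = Staging_01, FILENAME = 'E:\Data\Staging_01.mdf', SIZE = 25GB, FILEGROWTH = 1GB),
    (NAME = Staging_02, FILENAME = 'F:\Data\Staging_02.ndf', SIZE = 25GB, FILEGROWTH = 1GB),
    (NAME = Staging_03, FILENAME = 'G:\Data\Staging_03.ndf', SIZE = 25GB, FILEGROWTH = 1GB),
    (NAME = Staging_04, FILENAME = 'H:\Data\Staging_04.ndf', SIZE = 25GB, FILEGROWTH = 1GB)
LOG ON
    (NAME = Staging_log, FILENAME = 'L:\Log\Staging_log.ldf', SIZE = 10GB, FILEGROWTH = 1GB);
[/code]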
Thanks for your advice
March 25, 2008 at 6:47 am
To be honest it's far too general a question to answer. I've actually found that ETL/DW processes are most affected by spindle and array count. Just to explain: if you read and write to the same physical spindles during a process you'll get poor performance. Try this on your home PC if it has two separate physical disks: zip a large file onto the same disk, then zip it from one disk to the other and compare the difference.
You should understand the effect of RAID and spindles, so we'll take that as read; all you need to do is lay out your flow to avoid disk contention. CPU is useful, especially in transformations doing lots of maths/conversions, and you'll probably find memory has a big part to play too.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
March 25, 2008 at 6:54 am
Oh, as to filegroups etc.: I've seen too many poor implementations of filegroups; you MUST understand your data usage to make effective use of them. Make sure you do implement multiple files per filegroup, as this can give you benefits. As you mentioned SAN storage, make sure it is configured and set up correctly; there's a lot to lose here.
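Something along these lines; database, filegroup and path names are made up for illustration:
[code="sql"]
-- Add a dedicated filegroup with two files on separate arrays...
ALTER DATABASE Datamart ADD FILEGROUP FactData;

ALTER DATABASE Datamart ADD FILE
    (NAME = FactData_01, FILENAME = 'E:\Data\FactData_01.ndf', SIZE = 10GB),
    (NAME = FactData_02, FILENAME = 'F:\Data\FactData_02.ndf', SIZE = 10GB)
TO FILEGROUP FactData;

-- ...then put the big tables on it:
CREATE TABLE dbo.FactSales
(
    SalesKey bigint   NOT NULL,
    SaleDate datetime NOT NULL,
    Amount   money    NOT NULL
) ON FactData;
[/code]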
Use partitioning with files and filegroups; however, as has been mentioned in other threads, if you use an 8-core box and create say 8 filegroups with 8 files each, that puts your database on 64 physical files. Consider the management overhead of all that versus the questionable, minuscule performance gains.
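For illustration only, the basic pattern (names made up; note ALL TO one filegroup keeps the file count manageable):
[code="sql"]
-- Monthly range partitioning; RANGE RIGHT means each boundary date
-- starts a new partition.
CREATE PARTITION FUNCTION pfMonthly (datetime)
AS RANGE RIGHT FOR VALUES ('20080101', '20080201', '20080301');

-- Map every partition to the one filegroup rather than 8 x 8 files:
CREATE PARTITION SCHEME psMonthly
AS PARTITION pfMonthly ALL TO (FactData);

CREATE TABLE dbo.FactSalesByMonth
(
    SalesKey bigint   NOT NULL,
    SaleDate datetime NOT NULL,
    Amount   money    NOT NULL
) ON psMonthly (SaleDate);
[/code]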
As always test, test and test again.
Also remember the 20/80 rule (20% hardware, 80% software/code) in tuning: concentrate on the code first, that's where the real gains are to be made.
As a footnote, I've not found VMware good with largish databases, say around 1TB. 100GB is quite small these days; put 128GB of memory in your server and you should get most of it into memory, which is much better!
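Assuming a dedicated 64-bit box, that's just a case of capping the buffer pool so the OS keeps some headroom; figures below are illustrative:
[code="sql"]
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Leave a few GB of the 128GB for the OS and anything else on the box:
EXEC sp_configure 'max server memory (MB)', 122880;
RECONFIGURE;
[/code]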
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
March 25, 2008 at 6:55 am
One final point, and I don't want to sound holier-than-thou, but you should have planned all the hardware and structures way back, before you got to the implementation stage.
[font="Comic Sans MS"]The GrumpyOldDBA[/font]
www.grumpyolddba.co.uk
http://sqlblogcasts.com/blogs/grumpyolddba/
March 26, 2008 at 9:37 pm
Thanks, Colin, for your advice. Unfortunately this hardware was purchased well before we started implementing the DW, and it was considered redundant at the time!
Yes, we're doing a lot of testing and comparison to get the most out of it.
Thanks again