Disk arrangement

  • Good morning-

    I am going to rearrange the HDD configuration in our Production server; currently everything (logs, data, tempdb) is on one RAID 5 array, which is suboptimal. Ours is a reporting/data warehouse application that follows these steps:

    Deltas are pulled down from Enterprise systems overnight into a staging database (DB 1).

    When these pulls are complete, updates are applied to our warehouse database (DB 2).

    When the ODS updates are applied, processes that apply the changes to the database our report developers use (DB 3) start.

    When this step completes, the processes that build the reporting tables or refresh reports for the various business units begin, usually loading tables from DB 3 into DB 4, 5, 6, etc.

    Given this, I have two options for disk rearrangement (max of 4 arrays):

    1. OS/Apps on array 0; tempdb on array 1; logs on array 2; data on array 3

    2. OS/Apps on array 0; tempdb and logs on array 1; warehouse (DB 2) and reporting (DB 3) on array 2; staging (DB 1) and business unit (DBs 4, 5, 6, etc.) on array 3

    Given our situation, what are pros/cons of each arrangement?

    Thanks,

    Jason

  • What do you mean by array 0, 1, 2, 3, ...?

    If your configuration will be built on a single RAID 5, there is no difference in performance.

  • Sorry, I wasn't clear; currently it is one array; I want to split the drives into 4 arrays (0,1,2,3).

  • If my guess is correct, you have a RAID 5. You would like to partition this array into 4 logical drives.

    If so, it won't do anything to improve performance.

  • Assuming you're using "array" to mean "RAID group", then I'd assume you'd see SOME improvement. If we're talking about sub-partitions of a single RAID group, then SQL Oracle's right - no reason to expect any perf gains.

    The big difference I'm seeing is how much tempdb gets used. If it gets used a LOT during the load process, then having it on the same volume as the logs is going to be very slow. If it doesn't, then I'd think option 2 would be better, since you're talking about largely DW and reporting work (and no OLTP activity). That would entail relatively minor amounts of log activity, so logs and tempdb shouldn't be fighting too much.
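
    If you want numbers rather than a gut feel, the file-stats DMV is one way to see how hard tempdb gets hit during the load window. This is just a sketch for SQL Server 2005; snapshot it before and after the overnight run and compare the counters.

      -- Cumulative I/O against every tempdb file since the last restart.
      -- Run before and after the load window and diff the numbers.
      SELECT  vfs.file_id,
              vfs.num_of_reads,
              vfs.num_of_writes,
              vfs.num_of_bytes_read,
              vfs.num_of_bytes_written,
              vfs.io_stall_read_ms,
              vfs.io_stall_write_ms
      FROM    sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS vfs;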

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part... unless you're my manager... or a director and above... or a really loud-spoken end-user... All right - what was my emergency again?

  • Sorry, it's a bit confusing; I meant that I want to add two more RAID arrays to the server, giving us a total of four (we currently have two: one RAID 1 array for the OS and one RAID 5 array for everything else)...

    I think I have the answer, though; we do tend to use tempdb a lot.

    Thanks all!!

  • I don't know if this is a possibility given your hardware constraints, but there is another combination that would also be good to try.

    Array 1 - OS/Apps

    Array 2 - TempDB & Logs

    Array 3 - Database Files

    Array 4 - Index Filegroups
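
    For the index filegroup piece, this is roughly what it looks like; the database name, table, and drive letter below are made-up placeholders, not anything from your setup.

      -- Add a filegroup whose file lives on the array dedicated to indexes,
      -- then place nonclustered indexes in it. All names/paths are hypothetical.
      ALTER DATABASE ReportingDB ADD FILEGROUP IndexFG;

      ALTER DATABASE ReportingDB
      ADD FILE ( NAME = ReportingDB_Indexes,
                 FILENAME = 'I:\SQLData\ReportingDB_Indexes.ndf',
                 SIZE = 10GB, FILEGROWTH = 1GB )
      TO FILEGROUP IndexFG;

      CREATE NONCLUSTERED INDEX IX_FactSales_DateKey
      ON dbo.FactSales (DateKey)
      ON IndexFG;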

    Whatever you do, you should try to stay away from RAID 5 on a reporting server, since it is the slowest for writes. RAID 5 is great for reads but terrible for writes, and with a data warehouse, writing data during the ETL process is job #1. The exception would be if you have a very large ETL window that can absorb the extra write time.
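
    To put rough numbers on that write penalty: a small random write costs about four physical I/Os on RAID 5 (read data, read parity, write data, write parity) versus two on RAID 1/10. The per-disk figure below is just an assumption for a 10K drive, not a measurement.

      -- Back-of-the-envelope random-write throughput for a 6-disk array,
      -- assuming ~125 random IOPS per 10K spindle (illustrative only).
      DECLARE @disks int, @iops_per_disk int;
      SELECT  @disks = 6, @iops_per_disk = 125;

      SELECT  (@disks * @iops_per_disk) / 4 AS raid5_write_iops,   -- 4 I/Os per write
              (@disks * @iops_per_disk) / 2 AS raid10_write_iops;  -- 2 I/Os per write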

  • Jason Marshall (3/19/2008)

    Hi Jason,

    To ensure a fairly high-performance configuration, you need to consider the RAID levels on the different arrays. As already indicated in an earlier post, RAID 5 is great for reads and bad for writes, so you will need to figure this into your configuration.

    As the source data is loaded into the staging database (DB 1), which is shared with the developer database, that array will be write intensive and calls for RAID 10 with enough disks for good throughput. The same holds true for the array supporting the data warehouse and reporting databases, as the refresh cycle is again a write-intensive process that calls for RAID 10. For reporting alone you would ideally want a RAID 5 array, but as you are maxed out on the number of arrays, you will need to go with your original plan and share it with the data warehouse database.

    To ensure fault tolerance of the database logs, you will want a RAID 1 or RAID 10 array to support the log files and tempdb. For the OS and applications, you should go with a RAID 1 configuration for fault tolerance.

    Overall, your option 2 is the best; just make sure the RAID levels are correct for the type of data access patterns you experience.
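
    One way to check those access patterns before committing to a layout is the read/write split per file (a sketch against the SQL Server 2005 DMVs; the counters are cumulative since the last service restart):

      -- Read/write split per database file; heavy writers are the RAID 10 candidates.
      SELECT  DB_NAME(vfs.database_id)  AS database_name,
              mf.name                   AS logical_file,
              mf.type_desc              AS file_type,     -- ROWS or LOG
              vfs.num_of_reads,
              vfs.num_of_writes,
              vfs.num_of_bytes_read,
              vfs.num_of_bytes_written
      FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
      JOIN    sys.master_files AS mf
                ON  mf.database_id = vfs.database_id
                AND mf.file_id     = vfs.file_id
      ORDER BY vfs.num_of_bytes_written DESC;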

    Thank you,

    Phillip Cox

    MCITP - DBAdmin|MCTS - SQL Server 2005

  • Thanks for your responses, everyone.

    We're constrained by the total number of physical drives we have available (10). Using RAID 1 for OS/apps and tempdb/logs leaves me six drives. Given that, and the fact that 300 GB, 10K RPM drives are the biggest and fastest I can get for the server we have, would your recommendation still stand, Phillip? I was thinking of using two RAID 5 arrays (three drives each) for the database files in this situation.

    Thanks!

  • Jason Marshall (3/20/2008)

    Hey,

    How many physical arrays do you have (e.g., an MSA1000 or MSA1500, etc.)?

  • We're using a Dell PowerEdge 2900; I can have a maximum of four arrays, if I'm understanding your question right...

  • Stay away from RAID 5 for SQL Server data; it would be better to have the SQL Server data on single drives with a good Full Recovery Model backup strategy. That way you can restore very close to the point of the crash with the full backup, the transaction log backups, and the current log file (if you have the log file on another disk). So you get a good recovery strategy and none of the overhead of RAID 5. (I learned this the hard way. I had the same setup you do, all DBs on one RAID 5 array; I moved the DBs to their own disks and performance went up exponentially. I was disk bound all around.)

    I suggest RAID 1 if you can. If you are running out of drive space in the server, you can always get a Dell PowerVault to hook up to it via SCSI (blazing fast). But definitely save yourself some trouble and don't go RAID 5; there are too many drives to write to, so you won't get any kind of performance under load, especially for data warehouse systems with lots of big inserts, updates, and deletes - in other words, lots of I/O.
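
    A bare-bones version of that backup routine, just as a sketch - the database name and backup paths are placeholders, and in practice the log backups would run on a frequent schedule while the full runs nightly:

      -- Full recovery model plus full and log backups; write the backups to a
      -- different physical drive/array than the data files they protect.
      ALTER DATABASE WarehouseDB SET RECOVERY FULL;

      BACKUP DATABASE WarehouseDB
      TO DISK = 'F:\Backups\WarehouseDB_full.bak'
      WITH INIT;

      BACKUP LOG WarehouseDB
      TO DISK = 'F:\Backups\WarehouseDB_log.trn';

      -- After a data-drive failure, grab the tail of the log first, then
      -- restore the full backup and the log chain:
      -- BACKUP LOG WarehouseDB TO DISK = 'F:\Backups\WarehouseDB_tail.trn' WITH NO_TRUNCATE;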

    Good Luck!

    jim

  • Hi Jason,

    Sorry, OK. You have a PowerEdge 2900 with all internal storage, and you can fit a maximum of 10 disks into the server. You want to chop up these 10 physical disks into separate arrays?

    Just want to be sure of setup.

    Thanks,

    Phillip

  • Yes Phillip, exactly. I can have a max of 4 separate arrays.

  • Jason Marshall (3/20/2008)

    OK, this changes things a bit, as you are constrained to a limited number of disks. You can do as planned, but due to the lack of actual spindles, it's going to be tricky in terms of I/O performance. As you are limited in the number of disks, I would do the following:

    Array 0 = RAID 1 (2 disks) OS and Apps

    Array 1 = RAID 1 (2 disks) Tempdb and Logs

    Array 2 = RAID 1 (2 disks) Staging and Developer DBs

    Array 3 = RAID 10 (4 disks) Warehouse and Reporting

    You will need performance both for writing to the DW and for reading for reports, hence 4 drives to support reads and writes. Another possibility is to just RAID 10 all the disks and use logical drives, as this will provide the maximum amount of disk I/O.
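
    For what it's worth, once the new arrays exist, the file moves themselves are straightforward; a sketch, with made-up drive letters and logical file names (tempdb only picks up its new location after a service restart, and a user database has to be offline while its physical files are copied):

      -- Relocate tempdb; takes effect at the next SQL Server restart.
      ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\TempDB\tempdb.mdf');
      ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'T:\TempDB\templog.ldf');

      -- Relocate a user database: point SQL Server at the new paths, take the
      -- database offline, copy the files to the new arrays, then bring it online.
      ALTER DATABASE WarehouseDB MODIFY FILE (NAME = WarehouseDB_Data, FILENAME = 'E:\SQLData\WarehouseDB.mdf');
      ALTER DATABASE WarehouseDB MODIFY FILE (NAME = WarehouseDB_Log,  FILENAME = 'L:\SQLLogs\WarehouseDB_log.ldf');
      ALTER DATABASE WarehouseDB SET OFFLINE;
      -- (copy the .mdf/.ldf files to the new locations, then)
      ALTER DATABASE WarehouseDB SET ONLINE;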

    Thanks,

    Phillip Cox
