Should I have Multiple data files for large DB (250GB)?

  • I have a DB that is around 250GB that is growing rapidly. It is running on a 4 proc clustered server with EMC SAN for storage. The entire database resides on a single data file within a RAID 5 partition, and as of right now it is performing very well. However, I am wondering if I should create additional data files as this database grows in size. If all the data will reside on a single RAID 5 partition, are there any advantages to having multiple data files? It may become as large as a Terabyte within a year.  

     

    Thanks in advance.

    Ethan

  • Firstly if you intend to use multiple filegroups it is better to place each on a seperate array.

    Second using multiple filegroups means you have to start figuring out how your database works so you can use effective splitting of tables or indexes into the groups.

    Otherwise things you can do include making your raid 5 a raid 10 as that will speed up writes. You need to see if your database is making extensive use of tempdb, in which case you might consider splitting tempdb onto a seperate array. 250Gb isn't really that big - the main issue with databases around this size and greater is the storage of backups and sometimes the time to do the backup.

    You might want to consider partitioning of the data, many fast growing apps I've dealt with may store up to 80% of the data in a couple of tables, I have one database where 95% of the data is in one table - only partitioning would help here.

    My best advice is to analyse your database structure and its use before making any decisions on filegroups.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • hi Ethan,

    I think that a single data file is absolutely inadequate for a very large db as in your case, specially if it is growing so fast. Without consider performance and scalability-related issues, scattering relational data, blob data and indexes among different datafiles (belonging to different FILEGROUPS) helps you to define and maintain an effective backup strategy.

    Please refer to SQLBOL "File and Filegroup Backup and Restore" for further infos.

     

    best regards

    Marco Buttazzoni

  • Encountered very similar situation but where a database grew to a large size very quickly (reasons not pertinent). I had to move it to a machine where I could get the SAN storage defined to keep it growing.  I learned on that DB move that several smaller files (smaller in my case was 50-72 G each) is better because it leaves you more options.  When I moved the DB, I couldn't get a single disk defined  quick enough that was large enough to hold the largest DB file.  The only thing I could do to move the DB was to reload it (backup and restore were not an option).  I had a lot of space to deal with but the EMC storage I had access to was already defined into disk of sizes 72G and below.  It made me reconsider disaster recovery and DB file sizes.  In the event of a disaster and I had to recover any DB, I could get the space - but what if I didn't have the space in large enough chunks?  Files small enough to deal with is what I decided was best for my situation.  I would suggest considering disasters and business cases for server moves as part of your determination of how to break up the DB of that size.

  • What software are you using for backups?

  • Thanks for all your responses. It looks like I only have a couple of choices. One would be to partition the data on separate physical raid sets, and the other would be to change from raid 5 to raid 10. At this point we don't have the luxury of either option, but at least I have some info to pass on to management.

    The backup software we use is EMC BCV which allows us to have a syncronized full backup by just doing incremental updates at night. Very Cool.

    Again, thanks for the info.

    Ethan

  • Couple things which you might want to consider for the future are.

    1) have an additional HBA card. We have 100 MBPS HBA cards and for huge databases it used to be a bottle neck at times.

    2) The SAN configuration on EMC. We have run into isssue where EMC will present one big meta volume of 150 gigs and later on we found that some were concatenated, not striped properly etc. According to EMC, in our shop the best configuration they recommended was breaking into 64 GB volumes. They can maximum stripe across 8 disks only.

    We use EMC symmetrix SAN

  • SAN's are another subject all together. I did think most SANs connected by gb connections not by 100Mb fibre cards to the server. To be honest a single 100Mb connection will be a major bottleneck from your server.  We have 300Gb stripe sets across 10 spindles on our emc SAN.  ( sorry I'm not a SAN expert - I just ask for the disk arrays and they appear!! ) pretty good performance too.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply