Impact of multiple user database files?

  • My company recently migrated to new hardware. The decision was made to create a database file per core per filegroup so now the database is spread across > 80 files. All of these files reside on one RAID10 volume. I cannot find any evidence online to support this configuration for a user database (only tempDB and even then its 1/2 to 1/4 the number per core). Is this configuration ok or is there something we should look at specifically to see if it is causing performance issues?

  • What is the size of your database?

    Do you have 80+ RAID10 volumes?

    You're good with the tempdb files. Start with 1/4 and add up to 1/2 (the number of cores) if necessary.

    This survey is pretty good - http://www.sqlskills.com/blogs/paul/files-and-filegroups-survey-results/

    Regards,

    Igor Micev

    Igor Micev,My blog: www.igormicev.com

  • Database was ~250GB but then was sized up to 750GB to avoid auto-grow so there is a ton of free space in there.

    All files reside on a single RAID10 volume. This setup would have made more sense to me if each file was on its own set of drives but that is not the case.

  • The one database file per core recommendation is an outdated one for TempDB only. It's never been a recommendation for user databases and even for TempDB now it's 1/4-1/2 the number of cores, max 8 to start.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • 80+ filegroups is too much for that size of a database.

    80+ files is also a big number, but it's more normal for files.

    There are benefits of having multiple filegroups. Do you have a reason for having such many filegroups?

    I've worked on 100-500GB databases. They were having 15-20 filegroups and double number of files.

    However you can see in the survey that there are databases like yours.

    Igor Micev,My blog: www.igormicev.com

  • Igor Micev (3/14/2014)


    80+ filegroups is too much for that size of a database.

    80+ files is also a big number, but it's more normal for files.

    There are benefits of having multiple filegroups. Do you have a reason for having such many filegroups?

    I've worked on 100-500GB databases. They were having 15-20 filegroups and double number of files.

    However you can see in the survey that there are databases like yours.

    Sorry I wasn't very clear. There are four file groups all residing on one RAID10 volume:

    Primary: one file

    Filegroup2: forty files

    Filegroup3: forty files

    Filegroup 4: one file

    I was told that this was done to reduce read and write latency but i just can't find any evidence that this is correct.

  • There's no need for 80 database files for a user database. This gives you no performance benefit at all. All you are doing is making the database very hard to manage when it comes to moving the database to another drive, server, restoring the database, etc.

    I would highly recommend removing the extra database files from the database. One file per LUN is all you need. I've got clients with 3TB databases and only two data files and they perform just fine.

    99.9% of SQL instances need no more than 8 files for tempdb. The only reason to have more than 8 tempdb database files is because you are seeing latch waits for PFS pages (or GAM, but that's even less likely). If you are seeing PFS page waits then you should increase the number of files. If you aren't seeing PFS page waits then adding tempdb files won't do anything but slow down the SQL instance restart time. Of all the clients I work with only one that has more than 8 tempdb files on an instance, and they have 110 tempdb files.

    The files per core thing is a total myth at this point. The recommendation was made back in the SQL 2000 time because we only had 1 core per CPU so the math worked out easily.

  • Sorry I wasn't very clear. There are four file groups all residing on one RAID10 volume:

    Primary: one file

    Filegroup2: forty files

    Filegroup3: forty files

    Filegroup 4: one file

    I was told that this was done to reduce read and write latency but i just can't find any evidence that this is correct.

    That's because there isn't any benefit to doing this.

  • msmithson (3/14/2014)


    My company recently migrated to new hardware. The decision was made to create a database file per core per filegroup so now the database is spread across > 80 files. All of these files reside on one RAID10 volume. I cannot find any evidence online to support this configuration for a user database (only tempDB and even then its 1/2 to 1/4 the number per core). Is this configuration ok or is there something we should look at specifically to see if it is causing performance issues?

    Having 80 files for a DB is not a good design. It will have a negative impact on the performance and make administration complex.

    --

    SQLBuddy

  • I guess I'll be alone in this. 😀 I don't believe it's necessarily a bad design at all.

    First, having the 80 files won't degrade performance. It won't enhance it, but it won't degrade it.

    As for maintenance and administration, it won't cause any heartache if you've automated things like backups. It fact, it could reduce backup times if your automation checks to see if there have been inserts, updates, or deletes and backing up only those files that have actually experienced a change.

    It will also make dropping a customer easier in the future if the system was setup to tolerate just dropping a file.

    Last but not least, if will also allow for "piecemeal" restores if that was a consideration in the grand design of things.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/17/2014)


    First, having the 80 files won't degrade performance. It won't enhance it, but it won't degrade it.

    Too many data files can degrade performance. What too many is depends on a bunch of factors.

    See http://www.sqlskills.com/blogs/paul/benchmarking-do-multiple-data-files-make-a-difference/, where adding files improved performance up to 8 files, then degraded it after that point.

    It fact, it could reduce backup times if your automation checks to see if there have been inserts, updates, or deletes and backing up only those files that have actually experienced a change.

    That's feasible with lots of filegroups, not so much with lots of files in a small number of filegroups. Can greatly complicate restores, especially if the backups are purely based on changes. Log backups will have to be retained for long periods (from the least recently backed up file right up to present) to allow for a restore.

    It will also make dropping a customer easier in the future if the system was setup to tolerate just dropping a file.

    If the database had been set up with a small number of files per filegroup and one customer per file group, then yes (kinda, still have to empty the filegroup before dropping the files), but in this case with 40 files per filegroup across two filegroups, that's not going to be the case.

    Last but not least, if will also allow for "piecemeal" restores if that was a consideration in the grand design of things.

    Again, if there were lots of filegroups, yes. Less useful when there's loads and loads of files in a single filegroup as all files in a filegroup have to be restored for the filegroup to be available.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • To be clear, my response was based on the OP's claim of having a single file per filegroup. I absolutely agree that having a shedload of files per filegroup can cause performance problems when there are too many.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/17/2014)


    To be clear, my response was based on the OP's claim of having a single file per filegroup.

    He corrected that statement in a later post

    msmithson (3/14/2014)


    Sorry I wasn't very clear. There are four file groups all residing on one RAID10 volume:

    Primary: one file

    Filegroup2: forty files

    Filegroup3: forty files

    Filegroup 4: one file

    A small number of files per filegroup with multiple filegroups is useful for recovery, possibly backups, performance only to a point. Typically splitting to multiple filegroups for performance and splitting to multiple filegroups for recovery/restore options result in quite a different table-to-filegroup design. My personal preference is to split once it's been decided which one we're after, performance or restore/recovery options.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Ah. That makes all the difference in the world. I also see where I missed two key words in the original post. :blush: "per core".

    Thanks for the feedback, Gail.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • One of the things you are doing with a large number of physical files is forcing SQL Server to use at least one IO stream per file (assuming all files are needed to answer your query). SQL may decide that your access plan would benefit from multiple IO streams to some or all of the files.

    Therefore if your database is held in 80 files instead of 1, then you are forcing SQL Server to use at least 80 IO streams. This could be good for high-speed parallel access to data, but only if your storage subsystem can handle the load.

    If you plot response time to % busy on your storage system, you are almost certain to find a hyperbolic curve, where response time looks OK up to maybe 95% busy but then deteriorates rapidly.

    If you force too many IO streams into your subsystem, you may find one or more components get too near 100% busy. Your IO performance becomes far worse than if you had fewer IO streams and maybe 80% busy storage subsystem.

    What is 'too many' will depend on your storage subsystem. It may be worth working with your storage expert to model the potential IO rate if no bottlenecks existed compared to the physical characteristics of the data transfer rate on your motherboard, NIC, and external storage subsystem capabilities.

    You can then have a better idea of the maximum throughput possible with a given response time. From there you can look at either getting investment to improve the hardware to cope with the requested IO or spending time reducing the file count to match what the hardware can do.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply