Question on Filegroups, that I think I know the answer too...

  • I've been reading around, on "best (or recommended) practices" for SQL Server. One recommendation seems to be to have ".25 files per CPU core" for your DB. So, if I am reading this right, if I have a single core CPU, I should "split" my DB over 4 files.

    Now, I understand the basic concept of a filegroup (a "logical" container for the DB) so that's not the problem, or question.

    My question is, if I have an existing DB, which is on a single file (call it DB.mdf / ldf,) and I want to add additional files to it, can I simply go into the properties for the DB in SSMS, go to the "Files" page, and click "Add," and create a second (or third / fourth / etc) file, perhaps called "DB2.mdf?" Will the system then (presuming I also make sure this new file is in the correct filegroup) "spread" the DB out, or will I then need to "move" tables / indexes to the new file, presumably with T-SQL queries?

    I realize it would also likely help performance to spread these files over several spindles, as well, if possible.

    Thanks,

    Jason A.

  • jasona.work (11/19/2009)


    I've been reading around, on "best (or recommended) practices" for SQL Server. One recommendation seems to be to have ".25 files per CPU core" for your DB.

    That recommendation is for TempDB. Splitting user databases is based on IO load, backup strategy and other things.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • SQL Server will not spread out to these files automatically, though if they're in the same filegroup, SQL Server will use the space in a proportional manner related to sizes.

    http://msdn.microsoft.com/en-us/library/ms187087%28SQL.90%29.aspx

  • That recommendation is for TempDB. Splitting user databases is based on IO load, backup strategy and other things.

    The recommendation I saw for TempDB was 1 file per core.

    Did I misread?

    Thanks,

    Jason A.

  • Steve Jones - Editor (11/19/2009)


    SQL Server will not spread out to these files automatically, though if they're in the same filegroup, SQL Server will use the space in a proportional manner related to sizes.

    http://msdn.microsoft.com/en-us/library/ms187087%28SQL.90%29.aspx

    So, based on that, if I create more files, all part of the same filegroup (call it filegroup 1) for the DB in question, SQL will simply use the space in them, based on how much free space is available in each file. So, unless I also spread these files over multiple spindles, I'm not likely to get any performance gains, either, correct?

    FWIW, the DB in question is also using the Simple recovery model. This was done, as the DB in question does not require point-in-time recovery.

    Thanks!

    Jason A.

  • Depending who you read, recommendations for TempDB range from 0.25 files/core to 1 file/core. Regardless, the X files per Y cores is a standard recommendation for TempDB and is done for reasons of allocation contention.

    The decision whether or not to split up user databases (and if so how) is not simply based on X files per Y cores, but rather should be done if you're running into IO bottlenecks (and then the files must go onto separate physical drives) or if you want to use file/filegroup backups. Either way, the design around splitting user databases is far more complex than just X files per Y cores and let SQL handle it (which is what is done for TempDB)

    Let me see if I can find some stuff....

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Gail has listed some of the things to think about. If you want to split for performance, you are looking to (usually) separate out data and/or indexes to separate spindles and get better performance. A simple example might be a large or heavily accessed table for OLTP that you want to separate out from other tables because it's constantly used in clustered index scans/seeks for joins. Or maybe you'd move larger indexes to a separate file and try to get simultaneous access to two separate sets of data.

    It is a complex process, and it matters for tempdb as well. Plenty of systems work fine with one tempdb file, and may not benefit from more unless you can separate out the files and you make extensive use of tempdb.

  • jasona.work (11/19/2009)


    Steve Jones - Editor (11/19/2009)


    SQL Server will not spread out to these files automatically, though if they're in the same filegroup, SQL Server will use the space in a proportional manner related to sizes.

    http://msdn.microsoft.com/en-us/library/ms187087%28SQL.90%29.aspx

    So, based on that, if I create more files, all part of the same filegroup (call it filegroup 1) for the DB in question, SQL will simply use the space in them, based on how much free space is available in each file. So, unless I also spread these files over multiple spindles, I'm not likely to get any performance gains, either, correct?

    FWIW, the DB in question is also using the Simple recovery model. This was done, as the DB in question does not require point-in-time recovery.

    Thanks!

    Jason A.

    When filling space in the files, SQL uses a round-robin method - fill one file then move to the next and then the next, before growing any files.

    As for reading from the multiple files, you could see some read benefits since SQL could allocate a separate thread to each file

    Other possible reasons to create more files goes along with table partitioning, archival of data, read vs. write files and filegroups.

    Do you recall the source that recommended the 25 files?

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • CirquedeSQLeil (11/19/2009)


    When filling space in the files, SQL uses a round-robin method - fill one file then move to the next and then the next, before growing any files.

    That's log files. Data files use a proportional fill so all files in one filegroup should fill up at the same time

    As for reading from the multiple files, you could see some read benefits since SQL could allocate a separate thread to each file

    I'm afraid that's a myth.

    http://blogs.msdn.com/psssql/archive/2007/02/21/sql-server-urban-legends-discussed.aspx

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (11/20/2009)


    CirquedeSQLeil (11/19/2009)


    When filling space in the files, SQL uses a round-robin method - fill one file then move to the next and then the next, before growing any files.

    That's log files. Data files use a proportional fill so all files in one filegroup should fill up at the same time

    As for reading from the multiple files, you could see some read benefits since SQL could allocate a separate thread to each file

    I'm afraid that's a myth.

    http://blogs.msdn.com/psssql/archive/2007/02/21/sql-server-urban-legends-discussed.aspx

    Seen the round robin fill on multiple occasions in the data files. Most recently while converting a VLDB from a single file to use multiple filegroups. While moving the data from one file (and one filegroup) I monitored the size of the files in the new filegroups to check for progress. File 1 would fill up and then move to File 2. When File 2 reached it's limit, the data would start pumping back into File 1 again. Since I did not have enough space on the drives to allow for both the original files and the new files to be at the appropriate size, I needed to allow for AutoGrowth - making the round robin more evident.

    I had seen that Blog before. I have seen both sides of the argument from CSS blogs. Then throw in the Microsoft Documentation - makes the topic unresolved for me. It is for those reasons, I throw out the "could" - but it probably should have been left out.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • CirquedeSQLeil (11/19/2009)

    Do you recall the source that recommended the 25 files?

    It wasn't "25" files, but ".25" (point 25) files per core.

    Here's one of the places I found the recommendation, look at numbers 7 and 8:

    http://technet.microsoft.com/en-us/library/cc966534.aspx

    Jason A.

Viewing 11 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply