Index-Filegroup (how many Files)

  • Hi,
    I have created an index filegroup (NonClustered) on a separate device.
    How many files do you recommend for this index filegroup (DataWarehouse with 1200 Tables)
    CPU=24 Cores; Device=Fusion-IO

    Thanks
    Regards
    Nicole :rolleyes:

  • info 58414 - Thursday, July 6, 2017 1:41 AM

    Hi,
    I have created an index filegroup (NonClustered) on a separate device.
    How many files do you recommend for this index filegroup (DataWarehouse with 1200 Tables)
    CPU=24 Cores; Device=Fusion-IO
    Each Files for every DB - or same Files  for all DBs?

    Thanks
    Regards
    Nicole :rolleyes:

    You will have to elaborate further on this as you cannot share files between databases and I cannot understand this but being a question on that.
    😎.

  • info 58414 - Thursday, July 6, 2017 1:41 AM

    How many files do you recommend for this index filegroup (DataWarehouse with 1200 Tables)

    One file in a filegroup. there's no benefit to multiple files in a filegroup if they're all on the same drive.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • One file per file group provided you going to put the on separate disks , even the index filegroup should be on the different disk

    SQL 2000/2005/2008/2012 DBA - MCTS/MCITP

  • Kenny Jozi - Friday, July 7, 2017 6:37 AM

    One file per file group provided you going to put the on separate disks , even the index filegroup should be on the different disk

    Usually, not possible on a SAN and having separate filegroups on a SAN won't actually do much for performance because the R/W heads are jumping all over the place for other things.  Further, it makes restores a bit more complicated.

    Gail knows much more on this subject than I and will hopefully jump in again but I see no reason to put indexes on a separate filegroup on SANs even if you can guarantee they're on a separate physical spindle.  You can get much better performance increases if you write better code or fix code that already exists.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • info 58414 - Thursday, July 6, 2017 1:41 AM

    Hi,
    I have created an index filegroup (NonClustered) on a separate device.
    How many files do you recommend for this index filegroup (DataWarehouse with 1200 Tables)
    CPU=24 Cores; Device=Fusion-IO

    Thanks
    Regards
    Nicole :rolleyes:

    What problem are you trying to solve?  If it's for performance purposes (it usually is), I think you'll be disappointed because you can't normally guarantee which disk will be used for what on a SAN.  Even if you can make such a guarantee, you might still be disappointed because halving disk response time will pale compared to poorly performing code.  Find such code and fix it.  Your performance results will be much better. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Ok Jeff, I'll interrupt my Friday evening... 🙂

    There's two main reasons to split up a database into multiple filegroups.
    1) Performance
    2) Restore times/recoverability.

    Splitting for performance will only get you useful results if:
    - There is an IO bottleneck
    - The drives that the files are being split onto are on separate devices, ideally separate IO paths (depending where the bottleneck is)
    - Tables and indexes are split in such a way that the objects on different filegroups are frequently read from disk (or written to disk) at the same time.

    Splitting for recoverability requires identifying objects that are critical to the app and putting them (and their indexes) into one filegroup and putting objects that are not as important into one or more other filegroups. This way, in a disaster, you can restore just the critical components, get the app up and running, and restore the rest later.

    I really should actually write the article on this that I've been considering for years.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster - Friday, July 7, 2017 10:13 AM

    Ok Jeff, I'll interrupt my Friday evening... 🙂

    There's two main reasons to split up a database into multiple filegroups.
    1) Performance
    2) Restore times/recoverability.

    Splitting for performance will only get you useful results if:
    - There is an IO bottleneck
    - The drives that the files are being split onto are on separate devices, ideally separate IO paths (depending where the bottleneck is)
    - Tables and indexes are split in such a way that the objects on different filegroups are frequently read from disk (or written to disk) at the same time.

    Splitting for recoverability requires identifying objects that are critical to the app and putting them (and their indexes) into one filegroup and putting objects that are not as important into one or more other filegroups. This way, in a disaster, you can restore just the critical components, get the app up and running, and restore the rest later.

    I really should actually write the article on this that I've been considering for years.

    Thanks, Gail. 

    And, yes, I forgot about "get back in business" recovery times.  I have a database that was taking more than 10 hours to backup because it was so large.  Almost all of the data is static.  I partitioned it using 1 file group/file per month and set all the months except current and next month to read_only and did a "final" backup on the read only ones.  Backups now only take minutes and, as you've pointed out, I can restore the critical part of the database in minutes and then restore the other file groups on a much less urgent basis.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Friday, July 7, 2017 9:30 AM

    info 58414 - Thursday, July 6, 2017 1:41 AM

    Hi,
    I have created an index filegroup (NonClustered) on a separate device.
    How many files do you recommend for this index filegroup (DataWarehouse with 1200 Tables)
    CPU=24 Cores; Device=Fusion-IO

    Thanks
    Regards
    Nicole :rolleyes:

    What problem are you trying to solve?  If it's for performance purposes (it usually is), I think you'll be disappointed because you can't normally guarantee which disk will be used for what on a SAN.  Even if you can make such a guarantee, you might still be disappointed because halving disk response time will pale compared to poorly performing code.  Find such code and fix it.  Your performance results will be much better. 😉

    Back to you again, Nicole.  What problem is it that you're trying to solve?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Friday, July 7, 2017 9:26 AM

    Usually, not possible on a SAN and having separate filegroups on a SAN won't actually do much for performance because the R/W heads are jumping all over the place for other things.  Further, it makes restores a bit more complicated.

    The LUNs present ed by the SAN will usually be backed by a fast access cache, this itslef is usually configurable for the level of reads\writes.

    The idea is that I\O requests never actually hit the physical array, the cache contains the "live" data.
    In practice this isn't always the case.
    One thing is for sure, creating multiple LUNs for differing I\O patterns hitting the same array will just flood the array and SP with multiple I\O requests, something is going to give (and it usually starts with the misconfigured array cache 😉 )

    -----------------------------------------------------------------------------------------------------------

    "Ya can't make an omelette without breaking just a few eggs" 😉

  • Perry Whittle - Tuesday, July 11, 2017 5:18 AM

    Jeff Moden - Friday, July 7, 2017 9:26 AM

    Usually, not possible on a SAN and having separate filegroups on a SAN won't actually do much for performance because the R/W heads are jumping all over the place for other things.  Further, it makes restores a bit more complicated.

    The LUNs present ed by the SAN will usually be backed by a fast access cache, this itslef is usually configurable for the level of reads\writes.

    The idea is that I\O requests never actually hit the physical array, the cache contains the "live" data.
    In practice this isn't always the case.
    One thing is for sure, creating multiple LUNs for differing I\O patterns hitting the same array will just flood the array and SP with multiple I\O requests, something is going to give (and it usually starts with the misconfigured array cache 😉 )

    Thanks, Perry.  Is the interpretation of that "It's not nice to fool Mother SAN"? 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply