Index-Filegroup (how many Files)

Question

Index-Filegroup (how many Files)

info 58414

Hall of Fame

Points: 3386
More actions
July 6, 2017 at 1:41 am

#343449

Hi,
I have created an index filegroup (NonClustered) on a separate device.
How many files do you recommend for this index filegroup (DataWarehouse with 1200 Tables)
CPU=24 Cores; Device=Fusion-IO
Thanks
Regards
Nicole :rolleyes:

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply

Eirikur Eiriksson SSC Guru Points: 182882 More actions · Answer 1

info 58414 - Thursday, July 6, 2017 1:41 AM
Hi,
I have created an index filegroup (NonClustered) on a separate device.
How many files do you recommend for this index filegroup (DataWarehouse with 1200 Tables)
CPU=24 Cores; Device=Fusion-IO
Each Files for every DB - or same Files for all DBs?
Thanks
Regards
Nicole :rolleyes:

You will have to elaborate further on this as you cannot share files between databases and I cannot understand this but being a question on that.
😎.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 2

info 58414 - Thursday, July 6, 2017 1:41 AM
How many files do you recommend for this index filegroup (DataWarehouse with 1200 Tables)

One file in a filegroup. there's no benefit to multiple files in a filegroup if they're all on the same drive.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Kenny Jozi SSCrazy Points: 2027 More actions · Answer 3

One file per file group provided you going to put the on separate disks , even the index filegroup should be on the different disk

SQL 2000/2005/2008/2012 DBA - MCTS/MCITP

Jeff Moden SSC Guru Points: 1003865 More actions · Answer 4

Kenny Jozi - Friday, July 7, 2017 6:37 AM
One file per file group provided you going to put the on separate disks , even the index filegroup should be on the different disk

Usually, not possible on a SAN and having separate filegroups on a SAN won't actually do much for performance because the R/W heads are jumping all over the place for other things. Further, it makes restores a bit more complicated.

Gail knows much more on this subject than I and will hopefully jump in again but I see no reason to put indexes on a separate filegroup on SANs even if you can guarantee they're on a separate physical spindle. You can get much better performance increases if you write better code or fix code that already exists.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003865 More actions · Answer 5

info 58414 - Thursday, July 6, 2017 1:41 AM
Hi,
I have created an index filegroup (NonClustered) on a separate device.
How many files do you recommend for this index filegroup (DataWarehouse with 1200 Tables)
CPU=24 Cores; Device=Fusion-IO
Thanks
Regards
Nicole :rolleyes:

What problem are you trying to solve? If it's for performance purposes (it usually is), I think you'll be disappointed because you can't normally guarantee which disk will be used for what on a SAN. Even if you can make such a guarantee, you might still be disappointed because halving disk response time will pale compared to poorly performing code. Find such code and fix it. Your performance results will be much better. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

Ok Jeff, I'll interrupt my Friday evening... 🙂

There's two main reasons to split up a database into multiple filegroups.
1) Performance
2) Restore times/recoverability.

Splitting for performance will only get you useful results if:
- There is an IO bottleneck
- The drives that the files are being split onto are on separate devices, ideally separate IO paths (depending where the bottleneck is)
- Tables and indexes are split in such a way that the objects on different filegroups are frequently read from disk (or written to disk) at the same time.

Splitting for recoverability requires identifying objects that are critical to the app and putting them (and their indexes) into one filegroup and putting objects that are not as important into one or more other filegroups. This way, in a disaster, you can restore just the critical components, get the app up and running, and restore the rest later.

I really should actually write the article on this that I've been considering for years.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1003865 More actions · Answer 7

GilaMonster - Friday, July 7, 2017 10:13 AM
Ok Jeff, I'll interrupt my Friday evening... 🙂
There's two main reasons to split up a database into multiple filegroups.
1) Performance
2) Restore times/recoverability.
Splitting for performance will only get you useful results if:
- There is an IO bottleneck
- The drives that the files are being split onto are on separate devices, ideally separate IO paths (depending where the bottleneck is)
- Tables and indexes are split in such a way that the objects on different filegroups are frequently read from disk (or written to disk) at the same time.
Splitting for recoverability requires identifying objects that are critical to the app and putting them (and their indexes) into one filegroup and putting objects that are not as important into one or more other filegroups. This way, in a disaster, you can restore just the critical components, get the app up and running, and restore the rest later.
I really should actually write the article on this that I've been considering for years.

Thanks, Gail.

And, yes, I forgot about "get back in business" recovery times. I have a database that was taking more than 10 hours to backup because it was so large. Almost all of the data is static. I partitioned it using 1 file group/file per month and set all the months except current and next month to read_only and did a "final" backup on the read only ones. Backups now only take minutes and, as you've pointed out, I can restore the critical part of the database in minutes and then restore the other file groups on a much less urgent basis.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Jeff Moden SSC Guru Points: 1003865 More actions · Answer 8

Jeff Moden - Friday, July 7, 2017 9:30 AM
info 58414 - Thursday, July 6, 2017 1:41 AM
Hi,
I have created an index filegroup (NonClustered) on a separate device.
How many files do you recommend for this index filegroup (DataWarehouse with 1200 Tables)
CPU=24 Cores; Device=Fusion-IO
Thanks
Regards
Nicole :rolleyes:
What problem are you trying to solve? If it's for performance purposes (it usually is), I think you'll be disappointed because you can't normally guarantee which disk will be used for what on a SAN. Even if you can make such a guarantee, you might still be disappointed because halving disk response time will pale compared to poorly performing code. Find such code and fix it. Your performance results will be much better. 😉

Back to you again, Nicole. What problem is it that you're trying to solve?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Perry Whittle SSC Guru Points: 233865 More actions · Answer 9

Jeff Moden - Friday, July 7, 2017 9:26 AM
Usually, not possible on a SAN and having separate filegroups on a SAN won't actually do much for performance because the R/W heads are jumping all over the place for other things. Further, it makes restores a bit more complicated.

The LUNs present ed by the SAN will usually be backed by a fast access cache, this itslef is usually configurable for the level of reads\writes.

The idea is that I\O requests never actually hit the physical array, the cache contains the "live" data.
In practice this isn't always the case.
One thing is for sure, creating multiple LUNs for differing I\O patterns hitting the same array will just flood the array and SP with multiple I\O requests, something is going to give (and it usually starts with the misconfigured array cache 😉 )

-----------------------------------------------------------------------------------------------------------

"Ya can't make an omelette without breaking just a few eggs" 😉

Jeff Moden SSC Guru Points: 1003865 More actions · Answer 10

Perry Whittle - Tuesday, July 11, 2017 5:18 AM
Jeff Moden - Friday, July 7, 2017 9:26 AM
Usually, not possible on a SAN and having separate filegroups on a SAN won't actually do much for performance because the R/W heads are jumping all over the place for other things. Further, it makes restores a bit more complicated.
The LUNs present ed by the SAN will usually be backed by a fast access cache, this itslef is usually configurable for the level of reads\writes.
The idea is that I\O requests never actually hit the physical array, the cache contains the "live" data.
In practice this isn't always the case.
One thing is for sure, creating multiple LUNs for differing I\O patterns hitting the same array will just flood the array and SP with multiple I\O requests, something is going to give (and it usually starts with the misconfigured array cache 😉 )

Thanks, Perry. Is the interpretation of that "It's not nice to fool Mother SAN"? 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)