SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Optimal Database Filesize in Filegroup


Optimal Database Filesize in Filegroup

Author
Message
christian_t
christian_t
Say Hey Kid
Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)

Group: General Forum Members
Points: 681 Visits: 531
Hi there,

I was wondering if there is experience one where about the optimal filesize for databases (2012) lies.

I am creating a new table that will have quite an amount of rows.
Approximately about 1,5 billion rows per month.
The size of a rows will be 20 byte.
As I have a date-column I will be partitioning by date.
As this table will be filled for many years I also decided to create a table for each year and UNION ALL them with a view (constrain on date, so the optimizer can eliminate tables within the view)

I have 24 LUNs in the system, so I will create 24 files for each filegroup to equally arrange the data.
For each (year-)table I will have about 335 Gbyte so I would create 24*14GB-Files for each year.

However I could go further on and create 24 files for each month, so that would be about 1GB per file having 12 filegroups for each year (arranging it with partition function/scheme)

Version a: The partitioning will be over days but all in one filegroup. (Files 24x14GB)
Version b: The partitioning will be over days, each month in one filegroup. (Files 12x24x1,2GB)

I am not worried about the "work" to create all these files and groups but I was wondering, what is better for the sql server.
More smaller files or fewer bigger files.

Thx
Mitch
Jayanth_Kurup
Jayanth_Kurup
SSCertifiable
SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)SSCertifiable (5.5K reputation)

Group: General Forum Members
Points: 5491 Visits: 1351
I personally think smaller files are better than bigger files. If you have a lot of data that gets queried/modified frequently then it makes sense to spread it over many small files/disk luns to improve IO. Also it could help your SLA with piece meal restores in case of a DR. Large files probably make sense for less frequently queried data that doesnt change and probably belong to a read only filegroup. The data stored in these files also impact the files. If you are going to query the latest data 8 times out of 10 then small or big file size doesnt matter, the key is to store it together since most operations will need to happen within that partition.

Probably a hybrid approach with more files and filegroups for the initial 6 months could work better , have you explored that option ?

Jayanth Kurup
GilaMonster
GilaMonster
SSC Guru
SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)

Group: General Forum Members
Points: 218005 Visits: 46278
mitch.fh (1/2/2013)
More smaller files or fewer bigger files.


Unless all those files are on different physical drives, fewer bigger files is better. There's no gain from having lots of little files on the same RAID array, and there is a downside when you have *lots* of files in terms of time to open them, the overhead of calculating proportional fill and such.

If you have 24 LUNs, then there's no gain from more than 24 files and possibly a loss.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


christian_t
christian_t
Say Hey Kid
Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)

Group: General Forum Members
Points: 681 Visits: 531
I got two answers with two totally different opinions Wink

Unless all those files are on different physical drives, fewer bigger files is better. There's no gain from having lots of little files on the same RAID array, and there is a downside when you have *lots* of files in terms of time to open them, the overhead of calculating proportional fill and such.

The 24 LUNs is all I got Wink Every filegroup would put it files in the same LUNs
I think that like the cost of parallelism at one point exceed the benefit there is a "break-even point" for database files.
But maybe there is the same effect on the file size and it would be contrary (too big file getting slower)

Probably a hybrid approach with more files and filegroups for the initial 6 months could work better , have you explored that option ?

Under the assumption that it has a benefit, that could work, but I would have to merge continually older partitions in to a big one.

Load and query:
I will write the data date by date in this table. Usually "today" I will not write data that is for older dates then "yesterday".
The queries will be more frequent for the newer dates - approx. up to 6 month.
Older dates will be queried in about 10-20% of the queries. However for those 10-20% I need the complete data.
GilaMonster
GilaMonster
SSC Guru
SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)SSC Guru (218K reputation)

Group: General Forum Members
Points: 218005 Visits: 46278
mitch.fh (1/2/2013)
I think that like the cost of parallelism at one point exceed the benefit there is a "break-even point" for database files.


Don't make the mistake of thinking that multiple files are necessary for multiple threads to access. It's an old myth. Any thread in SQL can issue IOs to any file at any time.

There is no advantage to having multiple files on the same LUN (well, performance advantage, there are other reasons one might spilt, for backup/restore options). There's also no performance degradation for large files. If you're taking about files on the same LUN, 1 file of 10 TB is the same as 10 files of 1 TB.

As for multiple contrary opinions, welcome to the internet :-D. Guess you'll have to research and see who's opinion you would trust.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


christian_t
christian_t
Say Hey Kid
Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)Say Hey Kid (681 reputation)

Group: General Forum Members
Points: 681 Visits: 531
As for multiple contrary opinions, welcome to the internet :-D. Guess you'll have to research and see who's opinion you would trust.

over the last few years, you provided many good advises, so I have no reason to doubt you Wink

there are other reasons one might spilt

I am aware of the other reasons

There's also no performance degradation for large files. If you're taking about files on the same LUN, 1 file of 10 TB is the same as 10 files of 1 TB.

That is the information I was looking for w00t

I will stick to the plan and separate the years in different tables and different filegroups. Within the years I have paritions on the date but they will all go into the same filegroup (24 Files/ 1 per LUN).

The Paritioning within the table helps on queries and if I want to delete a whole date I can switch the partitions out of the table.
The different tables leave me the posibility to move older years to other LUNs. On Recovery/Restoring I will have some advantages to as I will not have to restore the whole table with all years if I have a recovery reason.

Constrains and Partitioning will do the magic on eliminating everything that is not relevant for the queries.

Thanks
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search