Blog Post

File This Under File Groups (Part 1)

,

I never had cause to give much consideration to storage when I was a SQL developer. The companies I worked for had folks to manage that, and my focus was on coding. We had standards we had to follow: no user objects on the PRIMARY file group, always specify a file group when creating a table or index, always create at least one additional file group besides PRIMARY when creating a database: but that was the extent of it.

Moving over to the admin side of the fence highlighted my ignorance for me. I suddenly had to deal with autogrowth settings run amok, file groups with an astonishing number of files in them, file groups with one file that had grown to many gigabytes, multiple transaction logs, files that didn’t seem to be used by anything, disk space running out, and so on. It’s been a challenge to get a handle on all this, and it’s most assuredly a work in progress, but at least it’s progress.

File groups represent the top level of the storage tree. They are logical constructs, a way to group the physical files that actually house the data. Every database has at least one file group, PRIMARY, by default. Many times, PRIMARY is the only file group, and everything gets dumped there. This isn’t necessarily a bad thing, but there seems to be a consensus in the SQL Server community that PRIMARY should be reserved for system objects, and additional file groups created to house the user objects. The PRIMARY file group is also the default file group, meaning that if another file group isn’t specified as part of the CREATE statement, the object will be placed on the default. You can (and likely should) specify a secondary file group as the default, to be certain that nothing but system objects reside on PRIMARY. There is a limit to the number of file groups that can be created: 32,767: but I can’t imagine anyone coming close to that number, so for all intents and purposes, you can have as many as you want or need.

Each file group, to be useful, needs at least one file in it. How many files is optimum? Things get a little fuzzy for me here. Too few can cause poor performance; so can too many. At this point, a short foray into how SQL Server uses files is necessary.

Files in a file group are used in a “round-robin” fashion, with each file receiving a share of the data. How much of a share a file receives is governed by proportional fill, so that files with larger amounts of free space receive comparably larger amounts of data. Ideally, all the files would be very close to the same size, with close to the same amount of free space, so that incoming data would be evenly and quickly distributed.

So: how many files, then? As always, the answer is, “It depends.” One file might be fine for slow growth groups such as domain values; larger, busier groups should likely have more. “Eight” seems to be a bit of a magic number for many, or a number based on NUMA cores in one fashion or another. There are many opinions on the subject out there in SQL land, and I’m still sorting through them.

What seems certain to me is that only one file group with only one file is not a good idea, at least not for any database of any size over a few megabytes. I’m presently wrestling with how to farm out 800GB worth of data from a single MDF file, a condition found in two of my databases. Another problem I’m faced with is file groups – usually PRIMARY – with very large numbers of files: I’ve got one database with 37 files on PRIMARY, all of varying sizes. I’m thinking that round-robin distribution and proportional fill mean SQL Server is using these files inefficiently, and I’ll need more file groups with fewer files to even things out.

As I mentioned at the start, this is new territory for me, and I’m learning as I go. What about you, reader? I’d enjoy hearing from other DBA’s about their storage trials and travails, what worked and what didn’t. This is the first of I don’t know how many parts, and in the future I’ll be writing about my progress – or the lack thereof – in understanding and making the most of SQL Server storage.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating