At what size db does it make sense to split into multiple files?

  • Btw - I just blogged about this thread and my two responses (hey - a free blog post for me :)) and asked for any opinions and data on if people have seen a need for this. I'd love to know if what we've been preaching isn't so cut-and-dried after all. See http://www.sqlskills.com/blogs/paul/2008/01/29/SearchEngineQA12ShouldYouCreateMultipleFilesForAUserDBOnAMulticoreBox.aspx.

    Cheers

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • I found another MS page on this. Interesting thing about this page is that is says:

    http://msdn2.microsoft.com/en-us/library/ms187087.aspx

    "Most databases will work well with a single data file and a single transaction log file."

    Could that be any more vauge? How would anyone know if their db "is like most"?

    Thoughts?

  • The problem - as with most advice - is that it's extremely hard to make generalizations. This is both because:

    1) without lots of evidence many people (quite rightly) don't believe sweeping generalizations as they may have been bitten by one in the past

    2) nearly every situation is different so many generalizations are useless

    What I'd love to see, (and I tried to do when at MS, and do on my blog or when teaching classes or conferences) is for people to provide the justification for generalizations, plus some idea of the exceptions and the circumstances under which they arise.

    As for this case, I think we've about done this one to death. The sweeping generalizations here are:

    1) for non-tempdb you usually don't need multiple files, unless you have a very high-end workload of the specific nature I described in my first post (rare)

    2) for tempdb you usually do, as long as your workload merits it on a multi-core/cpu box

    3) IO vendors may recommend it for increased IO throughput *on their specific hardware*

    4) there exist sweeping generalizations from various sources that dispute all of the above

    Unfortunately Tom, you're not going to get a definitive, authoritative answer to a design/strategy question such as this and you'll continue to find contradictions to anything anyone says here, and even MS contradicting itself (sigh).

    What I would suggest is the following:

    1) go with the majority opinion of the responses here, based on our collective experience with many customers, databases, and workloads

    2) do your own testing, on your own hardware, with your own workload and see what works for you (but beware that testing in a vacuum can prove or disprove anything you want - which is why you see so many contradictory statements)

    Have fun!

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • One last thing on MS - it's a very big company, with lots of groups. Anyone can sponsor a whitepaper, write a blog post/MSDN article/technet article and publish it, or reply on a forum as a visible MS person and it has the 'official stamp' of coming from MS. When I was in the product group I was continually dismayed by the misinformation, bad advice, contradictions, and baseless assertions that I saw coming from MS employees who were just trying to be helpful.

    Once something's published on the internet, it's *incredibly* hard to undo the damage done. There's a fundamental element of mistrust sometimes on forums and newsgroups which can be wearying when you're trying to help people out. It can be very hard to convince people that someone else's advice isn't the best to follow - I remember several times arguing with people about how CHECKDB works or what a corruption error message means and finally having to resort to 'I wrote that code - I'm afraid you *are* wrong' - which I hate doing.

    Anyway - my final 2c (seems like I've monopolized this thread a bit - sorry :))

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • I hope my thoughts can help clarify some of this.

    I manage 34,547 production databases across 36 SQL Servers, each server is running about 1,000 databases, we have 3 SAN's to hold all this wonderful data.

    (yes we also have staging and development environments too, but don't get me started on that)

    What we've done to improve performance can be done anywhere on any size server, even single disk deployments.

    We've also had several chats with Microsoft who have even been stumped on occasion because apparently SQL Server was designed for large databases in mind, not several thousand small ones. So contention has been an extremely big issue for us at different times. (Small being under 50GB)

    So here's the bottom line:

    For desktop and small business servers:

    In our testing, we've seen that having multiple tempDB's for multiple core machines does provide a concurrency boost thanks to technologies like NCQ built into Seagate drives.

    If you have all your data on a RAID 5 or RAID 6 configured array then there's no point in creating more than 2 tempDB data files. Any more than that will cause you to lose performance due to disk seek time.

    There is absolutely no reason to create multiple data files for a database unless you're experiencing concurrency issues with your databases. Even then, you're not guaranteed that it will help because you cannot control which data file the data will be written to unless you use filegroups and assign critical data tables to that filegroup. And even before splitting into multiple files I'd do some schema analysis to find out whether indexes are being properly used.

    For a desktop or server with multiple *physical* drives, splitting the log files, data files and tempdb away from each other 9 times out of 10 resolves concurrency issues. If you're doing enough transactions on a box that has multiple physical drives, no RAID and is a desktop class machine and you want to split tempDB for performance because all your queries are begin transaction and hold a lock at the table level for 5 seconds before committing the transaction and that's the *unit* of work that has to be done, then you need to break that transaction apart into smaller units if possible or evaluate a different kind of schema. Because splitting data files or tempdb files won't help you there.

    For larger enterprises with a SAN:

    Follow best practices, you'll do just fine.

    In our case however, with the sheer number of databases we have to manage and the number of connections to each server, and number of queries, and all that fun stuff. We have split tempDB by number of sockets, not CPU's. The servers have 4 Quad core processors, which is 16 processors. We actually hurt performance by splitting tempDB into 16 different files. Best performance gain was splitting it by the socket. The reason we got hurt was not because of CPU, but IO and seek times.

    Above all:

    - Best Practices do not trump common sense.

    - Question everything for your environment, do not blindly trust any advice or best practice.

    - Anyone who works with Kimberly Tripp is worth listening too. (Especially if the person offering advice happens to be married to her. e.g. Paul Randal)

    - Remember the advance of CPU power since 2005 has sky rocketed. You've seen the maturity of Dual Core CPU's to Quad core CPU's, marking the most insane advance of processing power ever. I/O is seriously lacking in relation to this surge in technology.

    Christopher Ford

  • Heh... all of that good information to solve, what?... .5% of the total performance problem? The real problem associated with most performance problems has nothing to do with hardware configuration (.5%) and usually has nothing to do with the type or availability of correct indexes (9.5% because people normally do some type of planning there).

    No, 90% of the problem is, simply, crap code. 😉 RBAR here, RBAR there, some hidden RBAR here and there... couple of single row cursors (as opposed to "control loop cursors") here and there because people haven't taken the time to figure out that, YES, it can be done in a set based manner.

    Throw in the RBAR associated with some poorly written UDF's and follow that up with aggregate views of aggregate views and a couple of accidental cross joins that some whacko fixed with a DISTINCT, and you really begin to see that no hardware tuning and no indexing technique in the world is gonna make a hill of beans for performance.

    Even a poorly designed database can see some real performance if the code is written correctly with both performance and scalability in mind.

    Fix the code, first...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Come on now, stretch your creativity here! =) Anyone can point at code or fixing code first as the problem or first step.

    What if you can't fix the code? What else can you do? SQL 2005 has given DBA's so much more power over fixing things without fixing "code". =)

    I did mention fixing code in my post. But that's not what this topic was about. It was about splitting databases. So, in that light, I am just going to assume you can't fix code or you've already fixed code.

    Christopher Ford

  • I think having proper code should be assumed. My original question was asked so that I can properly setup a new db and not worry about adding multiple files later on. Even if there is no performance gain, I do like the ability to manage smaller files. I appreciate all the info, it has been very useful.

  • Christopher Ford (2/13/2008)


    What if you can't fix the code?

    I did mention fixing code in my post. But that's not what this topic was about. It was about splitting databases. So, in that light, I am just going to assume you can't fix code or you've already fixed code.

    Heh... There's always a way to fix the code... That's because if you have a performance problem, it will stand a 90% chance of being a code problem.

    And, yeah, if you reach a bit, the topic actually is about fixing the code... why do you split a database or partition a table? Because you want to improve the performance of what? That's right... you want to improve the performance of the code!

    So spend the time on the code... if you're having performance problems, 90% of the time and ROI will be to fix the code. The hardware just can't handle bad code no matter how robust it is and no matter how many configuration tricks you've laid into it. The fastest dragster in the world won't make it 10 feet down the track if you have a driver that runs it into the wall right out of the gate.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (2/13/2008)


    Heh... There's always a way to fix the code... That's because if you have a performance problem, it will stand a 90% chance of being a code problem.

    I didn't want to get suckde back into this thread but I couldn't let this one pass.

    I disagree somewhat. If you have a 3rd party app, you can't touch the code. Sometimes you can't even change the schema (i.e. adding/changing/removing indexes or stats) without voiding your support agreement with the vendor. So, sometimes the only thing you can do is mess around with the physical setup to improve things. Also, even if you can change the code, its no substitute for good design in the first place.

    I think your estimate of 90% of the time its the code isn't true to real life (from what I and others have seen from many hundreds of customers). Without trying to put a number to it, there's a good proportion of problems that are caused by IO bottlenecks due to poor layout or an IO subsystem that's been outgrown by the workload its trying to support. There's also the cases of perfectly good code that doesn't have the right indexes underneath, or up-to-date statistics, or a badly fragmented indexes, and so on. It's certainly not a code issue 90% of the time when performance is suffering.

    Thanks

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Tom Leykis (2/13/2008)


    I think having proper code should be assumed. My original question was asked so that I can properly setup a new db and not worry about adding multiple files later on. Even if there is no performance gain, I do like the ability to manage smaller files. I appreciate all the info, it has been very useful.

    Sure, I absolutely agree about setting up the DB properly to begin with and have a deep appreciation for that. I got kinda lost in all the conversation about the balancing of files and CPUs, etc, etc, mostly because I also appreciate the info... most folks don't think about any of that until it's already become a problem. I think you're a smart man for planning for future performance and scalability of the hardware because most of that planning just makes it easier on the DBA.

    But if you or anyone else "think having proper code should be assumed", just wait until folks start calling your carefully designed, tweeked/tuned/peaked and balanced, water cooled, fire-breathing, brain-child of a server a piece of crap because their code takes a month of Sundays to run. You know the hardware... you built it... are you still going to assume that the code is proper? You bet your sweet bippy you won't... THEN and only THEN will you agree with what I've been saying... you will only get .5% extra performance by trying to tweek the hardware of even an average not-so-well-thought-out hardware configuration/design... 90% of all performance problems are in the code and there's almost always a way to fix the code.

    I apologize for the soap-box I'm standing on... 😉 It's just that I've seen good people spend months justifying the need for a new "super server", spend tens of thousands of dollars to buy the hardware, spend months setting it up, moving DB's, and testing, only to find there's virtually no difference in the speed of their nightly runs or the throughput of GUI code. Then, management turns to the people involved in the new hardware (that would be YOU, right?) and says, "Fix it or your fired". You KNOW there's no fixing the hardware... it's perfect... go ahead... say it... you "a$$-u-me the code is correct". :hehe:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Paul Randal (2/13/2008)


    So, sometimes the only thing you can do is mess around with the physical setup to improve things.

    What are you going to mess around with in the phsical setup? There's enough information in this thread alone to guarantee that whoever is setting up a system, will set it up just about perfectly! The changing of the physical setup usually means degrading it because it was darned near perfect to begin with. It's the cruddy 3rd party software you bought that muddied up the waters. Make them fix it! And spend the same time evaluating software as you did the hardware so you don't buy cruddy 3rd party software.

    Also, even if you can change the code, its no substitute for good design in the first place.

    Precisely... you setup the hardware perfectly to begin with... like I said, only .5% possible improvement on the hardware because it was almost perfectly setup. Run's like a clock...

    ... but the mouth-breathers that built the 3rd party software you speak of didn't take the same class on "good design in the first place" as you did. Their only interest was to put product on the street and either you or your bosses made the mistake of allowing that monstrosity into your server. Almost nothing you can/could do to the server will help because 90% of the problem is in the 3rd party code. And I can prove it without even being there... the server ran fine with other software until you installed that particular 3rd party code... now where do you suppose the problem actually is and how much improvement do you think you're going to get by tweeking on the server or the storage? :w00t:

    It's ok for you to disagree and take major exception to what I'm saying... you weren't there...

    ...all I know is that shifting from Standard Edition to Enterprise Edition on a brand new 16 CPU water cooled box with 32 gig of Ram and all the spindle balancing and file splitting techniques in the world on a brand new super-high-speed disk storage system didn't solve the average 640 deadlocks per day, nor did it fix the 10 minute long full server lockups that occured 5 times a day, nor did it make a 24 hour job that usually failed run in 15 minutes with no failures all while doing 50% more work, nor did it change dozens of 8-10 HOUR runs into 3-6 MINUTE runs (just to name a few of the dozens of problems I've encountered on just one production server)... nope, 90% of it was either us or the vendor changing the code (after we beat them to death, of course), 9.5% was done by re-evaluating some crucial indexes, and .5% was done by tweeking the hardware (we replaced one questionable drive)...

    ... my hardware guys loved it... we saved their "shadow makers" because the ol' man in the corner office was eating hubcaps and spitting out nails after spending all that money on new hardware... with virtually zero improvement in performance...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Cool - you've had different aggregated experiences from me, hence our disagreement in the percentages. I tend to deal with very high-volume VLDBs with well-written code where hardware usually is a major concern. But everything you say about 3rd party code/apps is obviously very true.

    If the code's near-perfect but the workload goes up such that the IO or CPU becomes saturated, there's nothing you can do in the code - you need bigger hardware or better physical file layout. You can't usually rearrange/change code to gain IO bandwidth or service more connections. For example, in a high-volume OLTP system, with great code, having a transaction log on a 5400RPM RAID5 array is going to be the high-order bit in fixing performance, 🙂

    My point (as I made above) is that sweeping generalizations are very hard to make - people always have different experiences and will disagree - doesn't mean either viewpoint is incorrect and worth arguing to death to defend.

    Anyway - nice side discussion from the main point of the thread.

    Paul Randal
    CEO, SQLskills.com: Check out SQLskills online training!
    Blog:www.SQLskills.com/blogs/paul Twitter: @PaulRandal
    SQL MVP, Microsoft RD, Contributing Editor of TechNet Magazine
    Author of DBCC CHECKDB/repair (and other Storage Engine) code of SQL Server 2005

  • Ok, this will really light your fire :hehe:

    The reason I was asking is becuase it's my database, for my company, that I manage the db for, because I'm anal when it comes to just how insanely far I can go to inprove performance, and I don't want anyone else touching it. This is my own personal fire-breathing dragon, and no one's code but mine gets to swim in the acid pool. The info I've gotten here is very useful and I appreciate it a great deal. Btw: My code is top notch, probably because I learned most everything about how to write proper code on sites like this, from folks such as yourselves. So thanks for that as well. 😛

  • Absolutely agree with that... that's what makes forums like this so interesting and useful... differing experiences and opinions...

    By the way... We did run into the problem of "If the code's near-perfect but the workload goes up such that the IO or CPU becomes saturated, there's nothing you can do in the code" that you speak of... if someone of your caliber where to deem the code "near perfect", then I might believe them. Our problem was that all of the prima-donnas in our shop insisted that their code was "near perfect", that no one could write better code than they or that the code had been "tuned over the years to a state of virtual perfection" as one manager put it...

    ... heh... they were so wrong...

    Anyway, thanks for going back and forth a bit on this with me. I realize that I deviated quite a bit from the OPs original intent of the thread, but when I see so many speaking of hardware as a possible performance solution, the soap box magically slips under my feet, the war drums sound off in the background, and new dents appear in the podium from where I pound my fists. Thanks for listening, folks... 🙂

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 40 total)

You must be logged in to reply to this topic. Login to reply