Sizing the Data File SQL 2008R2

  • Hello,

    I have come across several blogs/articles that recommend sizing the SQL data files to be roughly a year's growth away. This would help alleviate fragmentation as SQL wouldn't have to autogrow the files.

    I created a brand new database with 2 blank tables all in Primary filegroup (2MB total). I did a full backup (test_1.bak).

    Next I increased the size of the data file from 2MB to 2000MB and performed a full backup (test_2.bak).

    Test_2.bak was larger than the first one(test_1.bak) by ~3000KB although I did not perform any additional inserts/deletes. WITH COMPRESSION had the two very close, separated by ~10KB.

    Why is this, and should I pregrow my data file a year into the future?

    Thanks

  • The reason the two backups were different sizes is almost certainly due to the additional space map pages that get inserted into SQL Server as files grow.

    A single space map page can only manage a specific number of data pages. Therfore the larger the database file the more space map pages you need. Books Online has the details, but a Google search often gts you to the answer faster.

    It is best to work out how much disk space every database will need for the coming year for the following reasons:

    1) The time required to purchase more disk space if it is needed is normally far longer than the time between a DBA noticing that a disk is almost full due to unplanned growth and that disk filling up and the business no longer being able to store data.

    Good capacity planning should be a core part of a DBA's work.

    Also, not working out the 1-year database size for a new database means a key metric in working out how much the application will cost to run is not known. If your managers care about business cases then they need these figures as part of working out if the application is cost-effective.

    2) Windows file fragmentation is a cause of poor database performance. If you let your daatbase grow by small increments then you are putting an upper limit on its performance that you cannot fix by index rebuilds.

    The only way to fix windows file fragmentation is to do a Windows defrag, and this will slow your database while it is running.

    3) Every time a database or a log file grows, all insert (and sometimes all read) activity is halted until the file growth completes. These are seen by the user community as outages, and a good DBA should try to avoid such outages.

    You can avoid the time penalty for data files by enabling instant file initialisation (Google or BOL can tell you what you need to do). But you cannot avoid the time penalty for log file growth.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • In my opinion, it seems counter-intuitive to pick an arbitrary time limit for sizing the data. If I know I have 2tb of disk space available and I know it will take 2 years to reach 500gb, why not just size it for 500gb? Unless of course we have other db's, are planning on buying more hardware in a year, planning on adding more dbs in a month, but not new hardware for a year... All of these things should be taken into account when determining this.

    Jared
    CE - Microsoft

  • Ultimately you can pick whatever time horizon you want. However, there are factors you should include when picking your time scale...

    1) What is the realistic time between raising a concern about space shortage and the point at which that space is provided. This should cover the scenarios of a) providing additional space out of a free space pool and b) having to purchase hardware to provide the space. You should look at actual duration over at least the past 2 years to work out the realistic as opposed to the politically correct time.

    Capacity planning should always look at longer than the realistic time to do hardware provisioning.

    2) What type of budgeting time line does your management work to. Many managers have to estimate their CAPEX for the coming financial year about 4 months prior to year end. They often get jumped on (and loose some bonus) if they exceed the CAPEX allowance.

    Capacity planning needs to support the management budget process, so once each year you should plan storage and CPU use for about the next 16 months.

    3) The longer the timescale the less reliable it will be. Businesses change and what they need to do changes and the kit needed to do these things changes.

    Capacity Planning experts reckon +/- 25% is the best anyone can give for short term planning, and for anything over a year +/- 50% is the best that can be given. If you are not a capacity planning expert then your estimates will have less accuracy. All capacity plans should be checked against reality, and the reason for any differences resolved.

    You should not let lack of accuracy prevent you from doing the best capacity plan you can, as the alternative to planning is often chaos.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • I appreciate the answers guys, like the detailed explanations you posted Ed. Really shed some light onto the subject, some things I will definitely keep in mind. Thread completed! 😀

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply