How do you estimate what a database's size will be?
What do you consider, and how close does your estimate have to be?
I have had to answer that question many times, and I have had to come up with the estimates to match. People were always surprised to receive pages of explanation along with the magic number. But the number becomes much easier to understand when you are not just estimating space, you are estimating a storage strategy as well.
Let's face it: an estimate is an estimate. The difficulty comes when you do not put the number you are giving into perspective. If you come in too low, you might be blamed because the shortfall represents a non-budgeted cost; hopefully you catch it before the database runs out of space. Storage solutions can be expensive, and you have to come up with solid justification so the cost can be properly budgeted.
You need room to store the data, too.
You might have redundancy needs.
You might have fault tolerance needs.
You might have special backup needs.
You might be looking at an expensive SAN solution to help cover all of the above.
Everyone in the community knows that data files, log files, tempdb, the system databases, the OS, and program files do not mix well on the same disks. And what about the page file? And what happens when you put several database files on the same disk system? Two OLAP or OLTP databases placed side by side will certainly affect each other.
You must make sure I/O performance will be constant and predictable. You must make sure you will have enough disk space to meet the goal you were given, so you have to estimate the space the database will occupy in X months, X years, or whatever horizon applies. You might also want free space in reserve so you can use (almost) non-disruptive online re-indexing options.
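A back-of-the-envelope projection can make the "X months" figure concrete. Here is a minimal sketch assuming simple linear growth plus a reserved free-space percentage for re-indexing headroom; all of the figures and the function name are illustrative, not from the article's spreadsheet:

```python
def projected_size_gb(current_gb, monthly_growth_gb, months, free_space_pct=0.20):
    """Project a future database size with headroom for online re-indexing.

    free_space_pct reserves extra room on top of the projected data size,
    for example so the largest index can be rebuilt (a rebuild can
    temporarily need roughly the index's own size again).
    """
    grown = current_gb + monthly_growth_gb * months
    return grown * (1 + free_space_pct)

# Example: 50 GB today, growing 2 GB per month, budgeted for 24 months.
# That is 98 GB of data, plus 20% headroom.
size = projected_size_gb(50, 2, 24)
```

Real databases rarely grow in a perfectly straight line, which is exactly why the article pairs the numbers with pages of explanation: the assumptions matter as much as the result.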
Now, with all this in mind, let's take a look at what is needed purely to store the data on disk.
You can use the data dictionary of an existing database to get the information you need, put it into an Excel spreadsheet, and fine-tune the numbers from there. Below you will find an example of the query you can run to extract the basic information you need about table and index structures. You will see that two views are created before the query; this hides some of the complexity the query would otherwise expose.
The query (code in the Resources Section) is a complex derived-table join built on the new views as well as the system tables they are based on. Before each step within the query you will find an explanation of what it retrieves. The numbers and formulas used closely match what the SQL Server data structures actually use today. Some data types and details might be missing, but it is a fairly good start.
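The arithmetic the query implements can be sketched outside SQL as well. The following is a rough Python rendition of the per-table estimation formulas Microsoft publishes for SQL Server (row header, null bitmap, variable-length block, and the ~8096 usable bytes per 8 KB page); the column figures in the example are hypothetical, and this is a simplification of the actual query in the Resources Section:

```python
import math

def estimate_table_mb(num_rows, fixed_data_size, num_cols,
                      num_var_cols=0, max_var_size=0):
    """Rough table-size estimate following SQL Server's published formulas."""
    # Variable-length block: 2 bytes of overhead + 2 bytes per variable
    # column + the data itself (worst case: maximum declared size).
    if num_var_cols > 0:
        var_data_size = 2 + num_var_cols * 2 + max_var_size
    else:
        var_data_size = 0
    # Null bitmap: 2 bytes + one bit per column, rounded up to whole bytes.
    null_bitmap = 2 + math.ceil(num_cols / 8)
    # Data row header is 4 bytes.
    row_size = fixed_data_size + var_data_size + null_bitmap + 4
    # 8096 bytes are usable per 8 KB page; each row also costs a
    # 2-byte slot-array entry.
    rows_per_page = 8096 // (row_size + 2)
    pages = math.ceil(num_rows / rows_per_page)
    return pages * 8192 / (1024 * 1024)  # total size in MB

# Example: 1,000,000 rows, 100 bytes of fixed data across 10 columns,
# plus 2 varchar columns totalling at most 150 bytes.
mb = estimate_table_mb(1_000_000, 100, 10, 2, 150)
```

Indexes are estimated the same way, with the index key columns standing in for the row; that is what makes the spreadsheet approach workable, since each formula is just arithmetic on column sizes and row counts.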
You can put the result of this query into an Excel spreadsheet, start playing with the numbers, and let Excel calculate the new results for you.
You will also find a link at the bottom (Resources Section) to an example of an Excel worksheet you can use to manipulate the numbers, and then to document and justify your needs. I suggest you paste data into the spreadsheet carefully so you do not lose the formulas. You can also build an Excel form from scratch, improve the prediction mechanism, and put in better formulas of your own. This spreadsheet calculates space based on an existing database whose row counts are assumed to grow in a normally distributed fashion, which might not exactly match the behavior a database would show in real life.
The important thing is that you have something to start with, and an understandable way of presenting your numbers in a clean Excel format that everyone will easily be able to read.