SQLServerCentral Article

How Long Before Your Database Runs Out of Space?

,

Capacity Planning: Figuring Out How Long Until You Need More Space

The vast majority of databases out there are growing and not shrinking. Some DBA have generous companies that provide all of the storage that they could ever use. Others of us have to be prepared to run out of space.

Why do we care? Do we provision space? Not usually, but if we ignore it, we could be running to the CIO with an emergency out-of-space situation, rather than giving the CIO a six-month – or better one year – space usage warning. The former could put you on the unemployment line; the latter could put you behind the wheel of that new Tesla.

In order to know when a given database may run out of space, you need to know a few facts. For each file group of the database, what are these values:

  • The present size of its files.
  • The amount of free space within its files.
  • The growth increment of the file, and the maximum file size (if any).
  • The free space on disk to grow those files.
  • How rapidly each file group grows, best expressed in megabytes per day.

Capturing Usage Metrics

It is quite easy to capture database usage metrics with some tools. At my shop, we have an excellent monitoring tool with a statistics repository. I’m able to get the data I need by querying that repository. With dozens of live SQL instances and many terabytes of data, this query saves a great deal of time. I need not name what tool I use; suffice it to say that most of the major monitoring tool vendors capture the necessary data. That makes it quite easy since the data are already captured, and one can assemble a table and capture daily size information. For those who don’t have a monitoring tool (please let me know where to send the sympathy card), it is still possible for the DBA to capture file usage stats in a SQL job.

First of all, you need to have a database and table where you can collect the information you need. Here is a cleaned-up version of my table’s CREATE:

CREATE TABLE [dbo].[DBGrowthHistory](
    [TimeCollected] [datetime] NOT NULL,
    [Monitored_object] [nvarchar](100) NOT NULL,
    [DBName] [nvarchar](100) NOT NULL,
    [FGName] [nvarchar](100) NOT NULL,
    [FileLogicalName] [nvarchar](100) NOT NULL,
    [totalmb] [float] NOT NULL,
    [freemb] [float] NOT NULL,
    [growthsremaining] [int] NOT NULL,
    [DBGrowthHistory_ID] [bigint] IDENTITY(1,1) NOT NULL,
    CONSTRAINT [PK_DBGrowthHistory] PRIMARY KEY CLUSTERED
    (
        [DBGrowthHistory_ID] ASC
    )
    WITH
    (
        PAD_INDEX = OFF
        ,STATISTICS_NORECOMPUTE = OFF
        ,IGNORE_DUP_KEY = OFF
        ,ALLOW_ROW_LOCKS = ON
        ,ALLOW_PAGE_LOCKS = ON
        ,FILLFACTOR = 100
    ) ON [PRIMARY]
) ON [PRIMARY];

Our monitoring tool captures the statistics for each file group and file name in the database, every single day. We just extract that information out of the tool and populate the table. For those of us without that luxury, I have devised a query to capture these statistics for every single database on a server. It is based on a query I found online from Satya Jayanty and I am thankful for this helpful query. Let’s have a look at what I was able to muster:

SELECT
    CONVERT(datetime,CONVERT(date,GETDATE())) as [TimeCollected]
    ,@@Servername as [monitored_object]
    ,db_name() as [DBName]
    ,b.groupname AS [fgName]
    ,Name as [FileLogicalName]
    ,[Filename] as [OSFileName]
    ,CONVERT (Decimal(15,2),ROUND(a.Size/128.000,2)) as [totalMB]
    ,CONVERT (Decimal(15,2),ROUND((a.Size-FILEPROPERTY(a.Name,'SpaceUsed'))/128.000,2)) AS [freeMB]
FROM dbo.sysfiles a (NOLOCK)
JOIN sysfilegroups b (NOLOCK) 
    ON a.groupid = b.groupid
ORDER BY b.groupname;

The above query runs in the database for which you desire to collect statistics.  You can easily prepend this query with an insert and sp_msforeachdb in order to sweep all database stats for a SQL Server instance into the table.

Remaining Growth

So we have file sizes and free space in those files.  What we don't yet have is remaining growth.  This is the number of growths that the database file can do before it (a) fills up the disk; or (b) reaches the size limit for the file.  This metric is vital, and it is impossible to calculate an accurate days-remaining prediction without it.  

Example:  Your database has a file A on disk E.  The file is 30000MB with 1MB free, and grows by 1000MB (Never use percentage growth factors!  See below.), with no limit.  The disk is 500GB, which is 524288MB.  The only file on that disk is your database file.  Do you have 1000MB to go before you fill up, or do you have 483000MB?  The latter number is closer to reality.  Put the emergency defibrillator back into its wall kit and relax - or maybe not.

Multiple files on a drive: The chances that you have a 500GB volume with just one 30000MB (29.29GB) file are, in a normal production environment, more remote than winning both the PowerBall and Mega Millions jackpots in the same week.  You are likely to have multiple database files on that drive.  In that event, you have to look at two options:

  1. Using a calculation to split the space between the files; or
  2. Setting a maximum file size for each database file.

I prefer the second option.  Some people are allergic to setting hard maximums, but with that setting, you can get actionable metrics for growth life. Moreover, since you would be assigning max file size so that you can confidently monitor growth and predict days-till-full, you would not be as likely to miss an impending-doom scenario - if you report the metric and pay attention to it.

Mixed files on a drive: This is the real problem scenario.  If the drive is used for non-SQL items (e.g., backups, or [ugh] file shares, the availability of space is unpredictable.  What solution then?  Wild Guess? I think not.  This one is a tough pill to swallow, but I would increase the file to the planned max file size, and then set max file size so that it cannot grow (max file size is equal to current file size or greater by an amount that is less than one growth).  That way, other space hogs cannot grab the disk from you, and your database growth predictions will be all the more accurate.

How to Capture Available Disk Space

Now, if you don't have a monitoring tool that captures free disk space on all volumes and mount points (a mount point is a separate logical drive or LUN that is mounted as a folder name under an existing drive letter - you see these most often in larger SAN implementations that also involve failover clustering), you will need to grab these metrics.  Here is a script that uses powershell and xp_cmdshell to retrieve what you need

sp_configure 'xp_cmdshell',1
go
reconfigure
go
declare @svrName varchar(255)
declare @sql varchar(400)
--by default it will take the current server name, we can the set the server name as well
set @svrName = @@SERVERNAME
set @sql = 'powershell.exe -c "Get-WmiObject -ComputerName ' + QUOTENAME(@svrName,'''') + ' -Class Win32_Volume -Filter ''DriveType = 3'' | select name,capacity,freespace | foreach{$_.name+''|''+$_.capacity/1048576+''%''+$_.freespace/1048576+''*''}"'
--creating a temporary table
CREATE TABLE #output
(line varchar(255))
--inserting disk name, total space and free space value in to temporary table
insert #output
EXEC xp_cmdshell @sql
--script to retrieve the values in MB from PS Script output
select @svrName as [ServerName]
 ,convert(datetime,convert(date,getdate())) as TimeCollected
 ,rtrim(ltrim(SUBSTRING(line,1,CHARINDEX('|',line) -1))) as drivename
 ,round(cast(rtrim(ltrim(SUBSTRING(line,CHARINDEX('|',line)+1
 ,(CHARINDEX('%',line) -1)-CHARINDEX('|',line)) )) as Float),0) as 'capacity(MB)'
 ,round(cast(rtrim(ltrim(SUBSTRING(line,CHARINDEX('%',line)+1
 ,(CHARINDEX('*',line) -1)-CHARINDEX('%',line)) )) as Float),0) as 'freespace(MB)'
from #output
where line like '[A-Z][:]%'
order by drivename
drop table #output
go
sp_configure 'xp_cmdshell',0
go
reconfigure
go
This will show you server name, date, drive, total MB in drive, total free mb in drive.  Why this instead of xp_fixeddrives? If you have mount points, you will not see their space at all with the xp_fixeddrives proc.  The script above does not suffer from this fault.
The Equation
Now with all of the variables, you simply need to calculate for each file group:
Days Remaining = (Free space in files) + (Growth Increment * Number of Growths remaining) / (Growth per Day [averaged])
The lowest number of days remaining from all file groups is how long you have.
It is of course vitally necessary to ensure that all numbers are in the same unit of measure. I reduce everything to megabytes. Average growth per day can be skewed by such things as index rebuilds, archive/purge or extraordinary activity in the underlying application. Obtain these numbers daily, and if outliers exist, decide how to account for them, or to exclude them altogether. Your good judgment should guide you.
Reporting these numbers for a database (or for many) can be done in SQL, SSRS, Excel - whatever floats your boat. If the number of days drops below 365, a friendly note up the chain to management should be considered. If you are under 180 days left, then the note should be more urgent. If you are at 2 days left and haven't notified top management of the danger at least ten times by now, it's time to buy an airplane ticket to a non-extradition country.
You can also calculate the above for different file groups or file locations. Your query may vary to accomplish this.

Things That Will Mess You Up

If you do some things, you will have no reasonable chance of determining how long until your database is full.
Growth by Percentage:  No two growths in a file are likely to be the same. Calculating how many growths are left would leave Stephen Hawking in tears. Moreover, in a growing scenario, each growth is larger than the previous, raising the chance of a growth that is too large for the remaining disk. Always use a fixed number of megabytes as a growth factor. I use 1000MB (1024MB and multiples thereof have previously caused problems in SQL Server).
Mixed-Use Drives: I know I mentioned this already. I am intentionally repeating myself, not because I love redundancy but because I can't stress this enough. If there is any way on Earth to avoid using the same drive for SQL Server files and anything else, do it! With DAS (Directly-attached storage), there are performance benefits to separating drives. This fact also holds true for some SAN assets. But I am not writing this because of the performance concerns. If you have part of the drive assigned to backups or file shares, you must use maximum file sizes in the SQL files and also grow the file to its maximum size in order to ensure that you have an accurate calculation. It does no good to assign a growth maximum that SQL Server can never reach because other files are in the way. This solution may make it difficult to refresh a lower environment with production data, because the restored file sizes would be too large.
Not Checking Frequently:  Not only should you calculate the days-left-to-full of every major production database, you should look at it regularly.  By that I mean every single workday.  If some extraordinary event occurs that sucks up a whole pile of space, you need to know. What if someone puts a file onto your SQL drive without your knowledge? "Hey!  I found 570GB to hold the movies I downloaded from BitTorrent in violation of Company AUP! The DBA won't know because "No worries, there is plenty of space!" can quickly turn into "Could not allocate space for object 'whatever' in database 'PorkChopSandwiches' because the 'PRIMARY' filegroup is full..." Properly calculating days left would show you any kind of impact like that and give you an opportunity to act before the error arises.
Lumping Together Different File Groups:  If you have separate file groups, you should evaluate their growth and days-remaining individually, because SQL Server will not use the space you allocated to one file group to store data from other file groups. In our shop we have a PRIMARY file group, but many databases also have a separate file group for non-clustered indexes. If we ran out of space in the files that make up INDEX but PRIMARY has 1TB of free space, you are just as down as if the whole database was full.

Conclusion

I hope that this little tome is a good 10,000 foot view of SQL Server capacity management.  With a little work, you can stay in command of your environment and keep the plates spinning.

----

John F. Tamburo is the Chief Database Administrator for Landauer, Inc., the world's leading authority on radiation measurement, physics and education. John can be found at @SQLBlimp on Twitter.  John also blogs at www.sqlblimp.com.

Rate

4.67 (24)

You rated this post out of 5. Change rating

Share

Share

Rate

4.67 (24)

You rated this post out of 5. Change rating