Database Growth Trends – Idera Diagnostic Manager

How fast are your databases growing? Should you panic about a 1 TB database that has 100 GB free on disk? Are you safe with a 100 GB database that also has 100 GB free? Based on those sizes alone, I have no idea.

It’s possible that the 1 TB database is purging data as fast as it takes it in, so it’s been at 1 TB for a long time and may never use the 100 GB it has free. The 100 GB database may have been a new 1 MB database created two months ago that grew 1 MB at a time to get to where it is now. (Bonus points if you panicked reading that last sentence.)

The thing to look for is how big that database was before. What did you write down for how much space it used last week, last month, and several months ago? Do you remember all those stock broker commercials saying “Past performance doesn’t predict future results”? That line is supposed to lead you to think “Yeah, but it’s the best indicator I have.” The same goes for database growth. So, you did write down what it was before, right?

My Default Monitoring Message

I’ll copy/paste this section on multiple posts, so feel free to skip it if you’ve seen it before.

The biggest issue is the word “before”. If you weren’t watching your servers before, then you have nothing to compare the current state of your server to. And if you’re watching them consistently, you’ll be able to make an apples-to-apples comparison.

Your production servers should have some kind of monitoring software in place, and the monitoring software is already capturing this type of information to help make all the pretty charts and graphs you see through the GUI. You need this software to run so you can get alerted when things go wrong, and pulling this data will just make it more valuable.

Monitoring software doesn’t perform magic, so you can pull this data on your own if you don’t have any monitoring software. However, that’s taking on a lot of responsibility, time, and effort. If something is collecting the data already, take advantage of that.

For me, I use Idera Diagnostic Manager, so my queries are going to look at the data collected using this tool. Other monitoring tools are going to collect the same basic data in a database you can read.

If anyone from one of these companies wants to provide me with scripts that fill the same need, I’ll do a write-up on them and give you credit for being awesome and supporting your product like that. I can only write against what I can test against; it’s nothing personal.

What I’m Watching

I want to know when my files will hit an autogrowth and when they’ll run out of space.

Running out of space is pretty obvious. If you’re out of space, every command that needs more space will fail, and users will run to the supply closet for torches and pitchforks. It’s a bad day for the DBA.

Autogrowth is less of an issue and less obvious. That’s not saying it’s not an issue, especially for larger databases that can’t take advantage of instant file initialization (IFI). The command that needs more space, and any that pile up behind it, will wait patiently (barring any timeouts) for the growth to take place. However, SQL Server can be more patient than your users or applications.

That being said, if a simple script told me I’d soon need more space than is allocated in the database, I could grow the files when no one was waiting for the space. No complaints, just make the files larger and move on with my life. Oh, happy days.
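As a rough sketch of what “make the files larger” could look like, the statement below grows one data file ahead of time. The database name, logical file name, and target size are all made up for illustration; swap in your own, and keep in mind that SIZE has to be larger than the file’s current size.

```sql
-- Hypothetical database name, logical file name, and size.
-- Look up the real logical name in sys.database_files before running anything like this.
ALTER DATABASE [SalesDB]
MODIFY FILE (NAME = N'SalesDB_Data', SIZE = 120GB);
```

Run it during a quiet window so the growth happens while nothing is waiting on it.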

To get all of this information I need to know a few things:

How much space is used in each filegroup
How much space is allocated to each filegroup
How much can each filegroup grow according to disk free space and growth rates

Idera DM Captures That

Well, Idera does a pretty good job. It captures the data size (used), data expansion (growth by growth rate and disk free space), and data file size (allocated). The problem is that it captures this stuff on the database level, not the filegroup level. This means that it’s possible for a filegroup to run out of space while the database as a whole has tons of free space.

I’m currently working in an environment where all data files for a single database are on the same drive, so it’s less concerning to me. It is something to keep in mind, and may be very important to your environment. If this is critical to you, I touch on how to capture this data yourself below.

If separating filegroups isn’t an issue for you (it often isn’t) then all you have to do to trend growths with Idera is to write a very long script that hits SQLdmRepository..DatabaseSize a couple times. My script goes a little like this:

/*
MaxUsed is always from a week long range.  The reason is that this captures any spikes in usage such as off-hours index maintenance.
*/
;WITH MostRecent AS
(
    -- Latest collection per database from the past week:
    -- allocated size, plus the largest size the files could grow to
    SELECT X.DatabaseID
        , X.UTCCollectionDateTime
        , S.InstanceName
        , D.DatabaseName
        , DataFileSizeMB = X.DataFileSizeInKilobytes/1024
        , DataFilePotentialSizeMB = (X.DataFileSizeInKilobytes + X.DataExpansionInKilobytes)/1024
    FROM (
            SELECT DS.DatabaseID
                , RowNum = DENSE_RANK() OVER (PARTITION BY DS.DatabaseID ORDER BY UTCCollectionDateTime DESC)
                , UTCCollectionDateTime
                , DataFileSizeInKilobytes
                , DataExpansionInKilobytes
            FROM SQLdmRepository..DatabaseSize DS
            WHERE UTCCollectionDateTime > GETUTCDATE()-7
        ) X
        INNER JOIN SQLdmRepository..SQLServerDatabaseNames D ON X.DatabaseID = D.DatabaseID
        INNER JOIN SQLdmRepository..MonitoredSQLServers S ON S.SQLServerID = D.SQLServerID
    WHERE RowNum = 1
)
-- Peak used space (data + text + index) in a one-week window for each look-back period
, ThisWeek AS
(
    SELECT DS.DatabaseID
        , MaxUsedMB = Max(DataSizeInKilobytes + TextSizeInKilobytes + IndexSizeInKilobytes)/1024
    FROM SQLdmRepository..DatabaseSize DS
    WHERE UTCCollectionDateTime > GETUTCDATE()-7
    GROUP BY DS.DatabaseID
)
, OneMonth AS
(
    SELECT DS.DatabaseID
        , MaxUsedMB = Max(DataSizeInKilobytes + TextSizeInKilobytes + IndexSizeInKilobytes)/1024
    FROM SQLdmRepository..DatabaseSize DS
    WHERE UTCCollectionDateTime BETWEEN GETUTCDATE()-37 AND GETUTCDATE()-30
    GROUP BY DS.DatabaseID
)
, ThreeMonth AS
(
    SELECT DS.DatabaseID
        , MaxUsedMB = Max(DataSizeInKilobytes + TextSizeInKilobytes + IndexSizeInKilobytes)/1024
    FROM SQLdmRepository..DatabaseSize DS
    WHERE UTCCollectionDateTime BETWEEN GETUTCDATE()-97 AND GETUTCDATE()-90
    GROUP BY DS.DatabaseID
)
, SixMonth AS
(
    SELECT DS.DatabaseID
        , MaxUsedMB = Max(DataSizeInKilobytes + TextSizeInKilobytes + IndexSizeInKilobytes)/1024
    FROM SQLdmRepository..DatabaseSize DS
    WHERE UTCCollectionDateTime BETWEEN GETUTCDATE()-187 AND GETUTCDATE()-180
    GROUP BY DS.DatabaseID
)
, OneYear AS
(
    SELECT DS.DatabaseID
        , MaxUsedMB = Max(DataSizeInKilobytes + TextSizeInKilobytes + IndexSizeInKilobytes)/1024
    FROM SQLdmRepository..DatabaseSize DS
    WHERE UTCCollectionDateTime BETWEEN GETUTCDATE()-365 AND GETUTCDATE()-358
    GROUP BY DS.DatabaseID
)
, PreResults AS
(
    -- Days left = remaining space / growth over the period * days in the period.
    -- The .0001 guards against dividing by zero.
    SELECT MR.InstanceName
        , MR.DatabaseName
        , MR.DataFileSizeMB
        , ThisWeek_MaxUsedMB = CAST(W.MaxUsedMB AS INT)
        , OneMonth_MaxUsedMB = CAST(M1.MaxUsedMB AS INT)
        , ThreeMonth_MaxUsedMB = CAST(M3.MaxUsedMB AS INT)
        , SixMonth_MaxUsedMB = CAST(M6.MaxUsedMB AS INT)
        , OneYear_MaxUsedMB = CAST(M12.MaxUsedMB AS INT)
        , NextGrowthInDays_OneMonth = CAST((MR.DataFileSizeMB-W.MaxUsedMB)/(W.MaxUsedMB - M1.MaxUsedMB + .0001)*30 AS BIGINT)
        , NextGrowthInDays_ThreeMonth = CAST((MR.DataFileSizeMB-W.MaxUsedMB)/(W.MaxUsedMB - M3.MaxUsedMB + .0001)*90 AS BIGINT)
        , NextGrowthInDays_SixMonth = CAST((MR.DataFileSizeMB-W.MaxUsedMB)/(W.MaxUsedMB - M6.MaxUsedMB + .0001)*180 AS BIGINT)
        , NextGrowthInDays_OneYear = CAST((MR.DataFileSizeMB-W.MaxUsedMB)/(W.MaxUsedMB - M12.MaxUsedMB + .0001)*360 AS BIGINT)
        , OutOfSpaceInDays_OneMonth = CAST((MR.DataFilePotentialSizeMB-W.MaxUsedMB)/(W.MaxUsedMB - M1.MaxUsedMB + .0001)*30 AS BIGINT)
        , OutOfSpaceInDays_ThreeMonth = CAST((MR.DataFilePotentialSizeMB-W.MaxUsedMB)/(W.MaxUsedMB - M3.MaxUsedMB + .0001)*90 AS BIGINT)
        , OutOfSpaceInDays_SixMonth = CAST((MR.DataFilePotentialSizeMB-W.MaxUsedMB)/(W.MaxUsedMB - M6.MaxUsedMB + .0001)*180 AS BIGINT)
        , OutOfSpaceInDays_OneYear = CAST((MR.DataFilePotentialSizeMB-W.MaxUsedMB)/(W.MaxUsedMB - M12.MaxUsedMB + .0001)*360 AS BIGINT)
    FROM MostRecent MR
        INNER JOIN ThisWeek W ON MR.DatabaseID = W.DatabaseID
        -- Only periods where usage actually grew are joined; otherwise the projection is NULL
        LEFT JOIN OneMonth M1 ON MR.DatabaseID = M1.DatabaseID AND W.MaxUsedMB > M1.MaxUsedMB
        LEFT JOIN ThreeMonth M3 ON MR.DatabaseID = M3.DatabaseID AND W.MaxUsedMB > M3.MaxUsedMB
        LEFT JOIN SixMonth M6 ON MR.DatabaseID = M6.DatabaseID AND W.MaxUsedMB > M6.MaxUsedMB
        LEFT JOIN OneYear M12 ON MR.DatabaseID = M12.DatabaseID AND W.MaxUsedMB > M12.MaxUsedMB
)
SELECT InstanceName
    , DatabaseName
    -- Most pessimistic (smallest) projection; a NULL projection is treated as very far off
    , NextGrowthInDays_Min = CASE
            WHEN NextGrowthInDays_OneMonth <= ISNULL(NextGrowthInDays_ThreeMonth, 1000000)
                AND NextGrowthInDays_OneMonth <= ISNULL(NextGrowthInDays_SixMonth, 1000000)
                AND NextGrowthInDays_OneMonth <= ISNULL(NextGrowthInDays_OneYear, 1000000)
                THEN NextGrowthInDays_OneMonth
            WHEN NextGrowthInDays_ThreeMonth <= ISNULL(NextGrowthInDays_SixMonth, 1000000)
                AND NextGrowthInDays_ThreeMonth <= ISNULL(NextGrowthInDays_OneYear, 1000000)
                THEN NextGrowthInDays_ThreeMonth
            WHEN NextGrowthInDays_SixMonth <= ISNULL(NextGrowthInDays_OneYear, 1000000)
                THEN NextGrowthInDays_SixMonth
            ELSE NextGrowthInDays_OneYear
        END
    , OutOfSpaceInDays_Min = CASE
            WHEN OutOfSpaceInDays_OneMonth <= ISNULL(OutOfSpaceInDays_ThreeMonth, 1000000)
                AND OutOfSpaceInDays_OneMonth <= ISNULL(OutOfSpaceInDays_SixMonth, 1000000)
                AND OutOfSpaceInDays_OneMonth <= ISNULL(OutOfSpaceInDays_OneYear, 1000000)
                THEN OutOfSpaceInDays_OneMonth
            WHEN OutOfSpaceInDays_ThreeMonth <= ISNULL(OutOfSpaceInDays_SixMonth, 1000000)
                AND OutOfSpaceInDays_ThreeMonth <= ISNULL(OutOfSpaceInDays_OneYear, 1000000)
                THEN OutOfSpaceInDays_ThreeMonth
            WHEN OutOfSpaceInDays_SixMonth <= ISNULL(OutOfSpaceInDays_OneYear, 1000000)
                THEN OutOfSpaceInDays_SixMonth
            ELSE OutOfSpaceInDays_OneYear
        END
    , DataFileSizeMB
    , ThisWeek_MaxUsedMB
    , OneMonth_MaxUsedMB
    , ThreeMonth_MaxUsedMB
    , SixMonth_MaxUsedMB
    , OneYear_MaxUsedMB
    , NextGrowthInDays_OneMonth
    , NextGrowthInDays_ThreeMonth
    , NextGrowthInDays_SixMonth
    , NextGrowthInDays_OneYear
    , OutOfSpaceInDays_OneMonth
    , OutOfSpaceInDays_ThreeMonth
    , OutOfSpaceInDays_SixMonth
    , OutOfSpaceInDays_OneYear
FROM PreResults
-- Report anything projected to run out of space within 100 days or hit an autogrowth within 15 days
WHERE OutOfSpaceInDays_OneMonth < 100
    OR OutOfSpaceInDays_ThreeMonth < 100
    OR OutOfSpaceInDays_SixMonth < 100
    OR OutOfSpaceInDays_OneYear < 100
    OR NextGrowthInDays_OneMonth < 15
    OR NextGrowthInDays_ThreeMonth < 15
    OR NextGrowthInDays_SixMonth < 15
    OR NextGrowthInDays_OneYear < 15
ORDER BY DataFileSizeMB DESC

What this does is look at the trends based on what the database used 30, 90, 180, and 360 days ago to try to predict the future. By “predict the future” I mean it’s not perfect, and can’t be perfect. However, it gives you very good data with a few false positives and very few false negatives.
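To make the arithmetic concrete, here’s a small worked sketch of the projection for a single look-back period. Every number in it is invented for illustration, and it leaves out the .0001 the full script adds to the divisor as a divide-by-zero guard.

```sql
-- All values are hypothetical, not taken from any real server.
DECLARE @AllocatedMB DECIMAL(12,2) = 102000;  -- current data file size
DECLARE @PotentialMB DECIMAL(12,2) = 150000;  -- file size plus all the growth the disk allows
DECLARE @UsedNowMB   DECIMAL(12,2) = 90000;   -- peak used space this week
DECLARE @UsedThenMB  DECIMAL(12,2) = 84000;   -- peak used space about 30 days ago
DECLARE @DaysBetween INT = 30;

SELECT GrowthPerDayMB   = (@UsedNowMB - @UsedThenMB) / @DaysBetween
     -- Space left before the next autogrowth, divided by the growth rate
     , NextGrowthInDays = CAST((@AllocatedMB - @UsedNowMB) / (@UsedNowMB - @UsedThenMB) * @DaysBetween AS BIGINT)
     -- Space left before the files can't grow any further, divided by the growth rate
     , OutOfSpaceInDays = CAST((@PotentialMB - @UsedNowMB) / (@UsedNowMB - @UsedThenMB) * @DaysBetween AS BIGINT);
```

With these numbers the database grew 6,000 MB in 30 days, or 200 MB a day. The 12,000 MB of free space inside the files lasts about 60 days, and the 60,000 MB up to the potential size lasts about 300 days.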

Speaking of this not being perfect, it double-dips. If there are two databases growing on the same drive with 10 GB free, it says that each one has 10 GB of growth available. I was not able to rewrite this to take that into account while also accounting for databases with multiple files in a single filegroup spread across multiple drives.

With the thresholds in the WHERE clause, I’m now about two weeks ahead of projected file growths and over three months ahead of projected out-of-space errors.

This is very basic information, and I’d have trouble imagining any SQL monitoring software not capturing this in a table you can query.

If I Didn’t Have Idera…

If I didn’t have Idera, or any other monitoring software I could query, I’d be going out at least once a day and pulling the allocated and used space from every file on every server. I might want to do it more often to find peak usage, such as when online index rebuilds are using twice their typical space over the weekends; I take advantage of Idera capturing this hourly. I’d capture it using something like this:

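DECLARE @SQL NVARCHAR(2000)

-- sp_MSforeachdb replaces ? with each database name in turn
SELECT @SQL = 
'USE [?]

INSERT INTO DBA..FileSizes (DateAdded, Database_ID, File_ID, Data_Space_ID, SizeMB, UsedMB)
select GETDATE() 
, Database_ID = DB_ID()
, f.File_ID
, f.data_space_id
, SizeMB = ((f.size*8/1024)) -- size is in 8 KB pages
, UsedMB = (FileProperty(f.name, ''SpaceUsed'')*8/1024) -- also in 8 KB pages
from sys.database_files f
'

-- sp_MSforeachdb is undocumented and unsupported, so test it before you rely on it
exec sp_MSforeachdb @SQL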
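The capture script needs somewhere to land. The post doesn’t show the DBA..FileSizes table, so the definition below is only a guess at a schema that matches the column list in the INSERT; the data types, the key, and the assumption that a utility database named DBA already exists are all mine.

```sql
-- Hypothetical schema, inferred from the INSERT column list above.
USE DBA;  -- assumes a utility database named DBA already exists
GO

IF OBJECT_ID('dbo.FileSizes', 'U') IS NULL
BEGIN
    CREATE TABLE dbo.FileSizes
    (
        DateAdded     DATETIME NOT NULL,  -- when the sample was taken
        Database_ID   INT      NOT NULL,
        File_ID       INT      NOT NULL,
        Data_Space_ID INT      NOT NULL,  -- filegroup ID; 0 for transaction log files
        SizeMB        INT      NOT NULL,  -- allocated space
        UsedMB        INT      NULL,      -- FILEPROPERTY can return NULL
        CONSTRAINT PK_FileSizes PRIMARY KEY CLUSTERED (Database_ID, File_ID, DateAdded)
    );
END
```

If you collect from more than one server into a central table, you’d also want a server name column in there.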

Then I’d need to see how much free space I have on each drive, so I’d go to Ed Wagner’s article Time and Space: How to Monitor Drive Space in SQL Server. Knowing that we use MountPoints in places, I’d have to change his script to use “volume” instead of “logicaldisk” (also mentioned in his comments). This minor change is about the best case scenario you’ll run across.
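Ed’s script isn’t reproduced here. As a different route to the same numbers, and not the one his article takes, SQL Server 2008 R2 SP1 and later expose sys.dm_os_volume_stats, which reports by volume and so covers mount points. A minimal sketch, with the caveat that it only sees volumes that hold at least one database file:

```sql
-- One row per volume that hosts a database file on this instance.
SELECT DISTINCT
       vs.volume_mount_point
     , vs.logical_volume_name
     , TotalMB = vs.total_bytes / 1048576
     , FreeMB  = vs.available_bytes / 1048576
FROM sys.master_files mf
    CROSS APPLY sys.dm_os_volume_stats(mf.database_id, mf.file_id) vs
ORDER BY vs.volume_mount_point;
```

The same warning applies to this as to any script off the internet: test it against your own environment first.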

Remember, the scripts you find on the internet are typically designed to work perfectly on the AUTHOR’s system, not yours. It’s not Ed’s fault they don’t use MountPoints; in fact, he should probably be congratulated for not making his environment more complex than it needs to be. He should also be congratulated on not making a change to his article that he can’t properly test. At least you know you have to be extra careful testing this change instead of being led to believe the author did something they couldn’t do properly.

After getting the drive’s free space, I’d have to look at the file growth rates to see how large my files could grow. Keep in mind that growth in sys.database_files could be either a percentage or the number of 8 KB pages the file will grow by, depending on the value in is_percent_growth. Static growths are easy: just divide by 128 and that’s how many MB the file will grow by. It’s easier (and better for your server) to change percentage growths to MB than it is to look up the logarithmic functions used to calculate compound interest that also apply here.
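A minimal sketch of reading those growth settings for the current database follows. It assumes you only care about data files, and it reports the size of the next single growth rather than compounding percentage growths over time.

```sql
SELECT LogicalFileName = f.name
     , SizeMB = f.size / 128
     , GrowthSetting = CASE WHEN f.growth = 0 THEN 'No autogrowth'
                            WHEN f.is_percent_growth = 1 THEN CAST(f.growth AS VARCHAR(10)) + ' percent'
                            ELSE CAST(f.growth / 128 AS VARCHAR(10)) + ' MB'
                       END
     , NextGrowthMB = CASE WHEN f.is_percent_growth = 1
                           THEN f.size / 128.0 * f.growth / 100  -- percent of the current size
                           ELSE f.growth / 128.0                 -- growth is a count of 8 KB pages
                      END
     , MaxSizeMB = CASE f.max_size WHEN -1 THEN NULL             -- unlimited, so the disk is the cap
                                   WHEN 0 THEN f.size / 128      -- no growth allowed
                                   ELSE f.max_size / 128
                   END
FROM sys.database_files f
WHERE f.type = 0;  -- data files only

-- Switching a percentage growth to a fixed size (hypothetical names and increment):
-- ALTER DATABASE [SalesDB] MODIFY FILE (NAME = N'SalesDB_Data', FILEGROWTH = 512MB);
```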

Now that you have the free space on the drive and the growth rates on the files, you can calculate the maximum size the file can grow to. Use this with trends on the used space to calculate when that size won’t be enough for you.
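As a rough sketch of that last step, the query below trends used space per filegroup from the DBA..FileSizes table filled by the capture script earlier. It assumes that table holds at least 37 days of history, and it only projects the next autogrowth; to project out-of-space you’d swap AllocatedMB for the maximum size you calculated from drive free space and growth rates.

```sql
;WITH PerSample AS
(
    -- Roll files up to the filegroup for each collection
    SELECT DateAdded, Database_ID, Data_Space_ID
         , AllocatedMB = SUM(SizeMB)
         , UsedMB = SUM(UsedMB)
    FROM DBA..FileSizes
    WHERE Data_Space_ID > 0  -- skip transaction log files
    GROUP BY DateAdded, Database_ID, Data_Space_ID
)
, ThisWeek AS
(
    SELECT Database_ID, Data_Space_ID
         , AllocatedMB = MAX(AllocatedMB)  -- assumes files haven't been shrunk this week
         , MaxUsedMB = MAX(UsedMB)
    FROM PerSample
    WHERE DateAdded > GETDATE() - 7
    GROUP BY Database_ID, Data_Space_ID
)
, OneMonth AS
(
    SELECT Database_ID, Data_Space_ID
         , MaxUsedMB = MAX(UsedMB)
    FROM PerSample
    WHERE DateAdded BETWEEN GETDATE() - 37 AND GETDATE() - 30
    GROUP BY Database_ID, Data_Space_ID
)
SELECT DatabaseName = DB_NAME(W.Database_ID)
     , W.Data_Space_ID
     , W.AllocatedMB
     , ThisWeek_MaxUsedMB = W.MaxUsedMB
     , OneMonth_MaxUsedMB = M1.MaxUsedMB
     , NextGrowthInDays_OneMonth = CAST((W.AllocatedMB - W.MaxUsedMB) * 30.0 / (W.MaxUsedMB - M1.MaxUsedMB) AS BIGINT)
FROM ThisWeek W
    -- Only filegroups that grew over the month; this also rules out dividing by zero
    INNER JOIN OneMonth M1 ON M1.Database_ID = W.Database_ID
        AND M1.Data_Space_ID = W.Data_Space_ID
        AND W.MaxUsedMB > M1.MaxUsedMB
ORDER BY NextGrowthInDays_OneMonth;
```

Because this works per filegroup, it would catch the case where one filegroup fills up while the database as a whole still has plenty of free space.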

Make sure you test your results; you’ll have to support them.

So Your Database Is Growing…

Databases get larger, it’s what they do. A normal DBA then goes through and makes sure that there’s room for that growth. A better DBA will do that, too, but then follow it up by figuring out why the database is growing and try to justify it.

I have seen way too many databases out there that don’t purge old data that’s useless to the company. Sometimes the purge was never written; sometimes the code to do it is never called, for several somewhat humorous reasons. Any way you look at it, the data is growing and it’s easy to fall into the trap of just adding space every month, forever.

The way I handle this is by watching my tables and seeing how they’re growing. Tables are treated just like databases: the largest one isn’t necessarily the growth issue, and looking at historical data to see the growth trend is the better approach.

In the past I wrote a post called Monitoring Database and Table Sizes to get this info yourself. In the future, I’ll write Table Growth Trends – Idera Diagnostic Manager. At the time of this writing, I’m planning on the Table Growth post coming out three weeks after this one. Until then, I put this script up on my Idera Diagnostic Manager Scripts page, which is where I keep these posts organized and where I put my scripts to wait for me to blog about them.