Technical Article

Capturing Baselines on SQL Server: Where's My Space?


Storage is an integral part of a SQL Server installation. We need to make sure that the capacity and performance of the storage subsystem match our requirements in terms of the volume of data we need to store, over time, and the speed with which we need to access it.

In this article, we'll tackle the topic of monitoring disk space usage. By tracking how much is in use and how much is still available, over time we'll have the data we need for better capacity planning, and can ensure that a database won't ever run out of disk space. It is easy to find out how much space is free on a drive right now, but trending that information requires foresight and a bit of data capture. Fortunately, both are easy to do in SQL Server, as we'll see here.

In order to work through this article, you'll need to create a database in which baseline information can be stored. My previous post, Back to Basics: Capturing Baselines on Production SQL Servers, created one called BaselineData, and all the scripts included in this post assume this database exists, and that you're using SQL Server 2005 or higher.

The Data to Capture

In order to understand how much space databases are using, and how much is free, we need to capture file information for data and log files from either the sys.database_files or sys.master_files catalog views. If you're using SQL Server 2012, then you can also get file size information for all databases from the sys.dm_db_file_space_usage DMV. However, for SQL Server 2005 through SQL Server 2008R2, it provides this information only for the tempdb database.

Use of sys.master_files is preferable for capturing file information, as it is a system-wide view and it means we don't need to roll through each database, querying sys.database_files to retrieve information. However, neither of these views list how much of the file is in use, only the size of the file.

It is a good practice to size database files in advance, and grow them a manually, as necessary. This helps avoid the performance hit when files grow automatically, particularly during periods when the server is busy. As such, it is essential the DBA knows how much space is in use within a database file in order to determine when to grow a file manually, and therefore prevent auto-growth.

File growth and instant file initialization

Enabling instant file initialization reduces the cost of growth events for data files. However, log files cannot utilize this feature.

We can capture space usage within a file using the FILEPROPERTY function, but FILEPROPERTY is database-scoped, so when we call the function and pass in the logical file name, from sys.master_files, we get the usage information for only the local database. Ultimately, we end up querying sys.database_files for each database.

Storing File Information

Listing 1 creates a base table, FileInfo, in which to store our physical file statistics history. Keep in mind that, over time, this table can become very large, depending on the number of databases and files, along with the frequency of collection. DBAs should develop a method to purge this table on a regular basis.

USE [BaselineData];
            FROM    [sys].[tables]
            WHERE   [name] = N'FileInfo' ) 
    DROP TABLE [dbo].[FileInfo]
CREATE TABLE [dbo].[FileInfo]
      [DatabaseName] SYSNAME NOT NULL ,
      [FileID] INT NOT NULL ,
      [Type] TINYINT NOT NULL ,
      [DriveLetter] NVARCHAR(1) NULL ,
      [LogicalFileName] SYSNAME NOT NULL ,
      [PhysicalFileName] NVARCHAR(260) NOT NULL ,
      [SizeMB] DECIMAL(38, 2) NULL ,
      [SpaceUsedMB] DECIMAL(38, 2) NULL ,
      [FreeSpaceMB] DECIMAL(38, 2) NULL ,
      [MaxSize] DECIMAL(38, 2) NULL ,
      [IsPercentGrowth] BIT NULL ,
      [Growth] DECIMAL(38, 2) NULL ,
      [CaptureDate] DATETIME NOT NULL

Listing 1: Creating the FileInfo table

Please note that the FileInfo table is not indexed. DBAs are encouraged to create the indexes that will best support the queries they wish to run against this table (see later).

Capturing the Baseline File Data

Now we're ready to collect the file statistics, including the space used in each file. As discussed earlier, our query interrogates sys.database_files for each database, and uses the FILEPROPERTY function to capture the amount of space used within the files for that database. In this example, we use a cursor to iterate through all online databases for an instance.

Note that this query will not return data for any database that is not online (e.g. has a status of RECOVERING, OFFLINE, or SUSPECT). This is accomplished by loading only the databases from sys.databases with a value of 0 for state.

USE [BaselineData];
    FROM    [sys].[databases]
    WHERE   [state] = 0
    ORDER BY [name];
    OPEN DBCursor;
            SET @sqlstring = N'USE ' + @DBName + '
      ; INSERT [BaselineData2012].[dbo].[FileInfo] (
      SELECT ''' + @DBName
                + ''' 
      CAST( as DECIMAL(38,0))/128., 
      CAST(FILEPROPERTY([name],''SpaceUsed'') AS DECIMAL(38,0))/128., 
      (CAST( as DECIMAL(38,0))/128) - (CAST(FILEPROPERTY([name],''SpaceUsed'') AS DECIMAL(38,0))/128.),
      FROM ' + @DBName + '.[sys].[database_files];'
            EXEC (@sqlstring)
            FETCH NEXT FROM DBCursor INTO @DBName;
    CLOSE DBCursor;

Listing 2: Capturing file statistics for all database on an instance

Run this query on a regular basis, via a scheduled job, to capture successive snapshots of current file information. Schedule the job according to how often you have observed your data and log files growing. If you don't have that information, interrogate the default trace in SQL Server to find out often auto-growth has occurred (see, for example,

Querying the Historical File Data

We can slice and dice this data in many ways, depending on what is of interest to the DBA. For example, Listing 3 returns the current amount of free space for all files on a SQL Server instance, with those with the least amount of free space listed first.

USE [BaselineData];
SELECT  [DatabaseName] ,
        [FileID] ,
        [DriveLetter] ,
        [LogicalFileName] ,
        [PhysicalFileName] ,
        [SizeMB] ,
        [SpaceUsedMB] ,
        [FreeSpaceMB] ,
        CAST([FreeSpaceMB] / [SizeMB] * 100 AS DECIMAL(38, 2)) AS 'PercentFree' ,
        CASE WHEN [MaxSize] = 0
             THEN 'Max size = ' + CAST([SizeMB] / 128.00 AS VARCHAR(15))
                  + ' MB'
             WHEN [MaxSize] = -1 THEN 'No max size set'
             WHEN [MaxSize] = 268435456 THEN 'Max size = 2 TB'
             ELSE 'Max size = '
                  + CAST(CAST([MaxSize] / 128.00 AS DECIMAL(38, 2)) AS VARCHAR(15))
                  + ' MB'
        END AS "MaximumSize" ,
        CASE WHEN [Growth] = 0 THEN 'No growth'
             WHEN [Growth] > 0
                  AND [IsPercentGrowth] = 1
             THEN CAST([Growth] AS VARCHAR(15)) + '%'
             WHEN [Growth] > 0
                  AND [IsPercentGrowth] = 0
             THEN CAST(CAST([Growth] / 128.00 AS DECIMAL(38, 2))  AS VARCHAR(15))
                  + ' MB'
        END AS 'AutoGrowth' ,
FROM    [dbo].[FileInfo]
WHERE   CONVERT(VARCHAR(10), [CaptureDate], 112) = CONVERT(VARCHAR(10), GETDATE(), 112)
ORDER BY [PercentFree] ASC

Listing 3: Which database files on an instance have the least free space?

This base query only returns rows captured for the current date. However, it is easy to modify the predicate and ORDER BY clause for this query, as necessary. For example, to trend the size and space used for a database's files over time, simply amend the WHERE and ORDER BY clauses as follows:

WHERE [DatabaseName] = N'[AdventureWorks]'
ORDER BY [FileID], [CaptureDate];

We can track total database size simply by summing size of the data and log files for a database, over time.

SELECT  [DatabaseName] ,
        SUM([SizeMB]) AS 'DatabaseSizeMB' ,
        SUM([SpaceUsedMB]) AS 'SpaceUsedMB' ,
        SUM([FreeSpaceMB]) AS 'FreeSpaceMB' ,
        CAST(SUM([FreeSpaceMB]) / SUM([SizeMB]) * 100 AS DECIMAL(38, 2)) AS 'PercentFree' ,
FROM    [dbo].[FileInfo]
WHERE   [DatabaseName] = N'[AdventureWorks]'
GROUP BY [DatabaseName] ,
ORDER BY [CaptureDate] DESC;

Listing 4: Tracking total database size

If you'd prefer to track only the total size of all data files for each database, or want to track total log file size separately, you may wish to exclude log files from the query, as shown in Listing 5.

SELECT  [DatabaseName] ,
        SUM([SizeMB]) AS 'DatabaseSizeMB' ,
        SUM([SpaceUsedMB]) AS 'SpaceUsedMB' ,
        SUM([FreeSpaceMB]) AS 'FreeSpaceMB' ,
        CAST(SUM([FreeSpaceMB]) / SUM([SizeMB]) * 100 AS DECIMAL(38, 2)) AS 'PercentFree' ,
FROM    [dbo].[FileInfo]
WHERE   [DatabaseName] = N'[AdventureWorks]'
        AND [Type] <> 1
GROUP BY [DatabaseName] ,
ORDER BY [CaptureDate] DESC;

Listing 5: Tracking total database size, excluding log files

We can monitor the total amount of space used by database files per drive, an important metric for a DBA. Listing 6 returns this data for the current date. If you happen to capture this information more than once on a single day, your results will be incorrect.

SELECT  [DriveLetter] ,
FROM    [dbo].[FileInfo]
WHERE   CONVERT(VARCHAR(10), [CaptureDate], 112) = CONVERT(VARCHAR(10), GETDATE(), 112)
GROUP BY [DriveLetter];

Listing 6: Total size of all database files, per drive

Knowing the amount of cumulative space consumed by the database files is only half the battle. We also need to know how much free space exists on the drives where the files are stored.

In the past, we've had to use the undocumented extended stored procedure, xp_fixeddrives to capture this data. However, Microsoft has deprecated extended stored procedures and xp_fixeddrives does not work with mounted volumes. DBAs running SQL Server 2008R2 SP1 and higher can use the sys.dm_os_volume_stats DMV to capture the same information:

        ( [vs].[logical_volume_name] ) AS 'Drive' ,
        [vs].[available_bytes] / 1048576 AS 'MBFree'
FROM    [sys].[master_files] AS f
        CROSS APPLY [sys].[dm_os_volume_stats]([f].[database_id],
                                               [f].[file_id]) AS vs
ORDER BY [vs].[logical_volume_name];

Listing 7: Using sys.dm_os_volume_stats can to capture free space per drive

We can log this information to a table with the same frequency as the database file information. DBAs can take this a step further by querying the output table and sending an email to relevant parties when free space drops below a pre-defined threshold.

Final Notes

It is essential that DBAs monitor file size and drive usage, and these scripts provide a foundation from which any DBA can work to establish a set of baselines. The article provided one query to capture the data, and demonstrated a few ways to analyze it, but all the queries are easy to adapt as required. The main goal is to determine what's needed for a given environment and build from there.

As noted, there is additional work to be done before you implement these in your environment; DBAs who wish to capture and analyze this data will need to create appropriate indexes on the FileInfo table, develop a method to purge the table on a regular basis, and include it in a regular index maintenance plan.

The next and final article in this series on capturing baseline data will show how to trend wait statistics over time.



You rated this post out of 5. Change rating




You rated this post out of 5. Change rating