Blog Post

A Tale of SQL Server Disk Space Trials and Tribulations

,

#SQLNewBlogger

Details:

System: 5.5GB of RAM, Windows 2008 R2, vmware, 4 cores, SQL Server 2012 Standard Edition. The database currently has the following data files: two files in the PRIMARY filegroup, and one other file in another filegroup, plus one log file. It used to have two log files…

Current space consumed on disk before operations: 422GB

Current space consumed on disk AFTER operations: 124GB

sp_helpdb before and immediately after:

Before
Before
After
After

Backup time before operations: 60 minutes striping across four files.

Backup time AFTER operations: 4 minutes striping across four files.

One of the perennial challenges database administrators (DBAs) face everyday is dealing with multiple copies of the same database living on several different servers. If the gold standard (production) has disk space inefficiencies in it that usually gets multiplied by every copy of that database throughout the environment.

I swear they are multiplying like rabbits! Oh wait...
I swear they are multiplying like rabbits! Oh wait…

It came to my attention by higher pay grades that we had to get a handle on this disk space issue for this one database because we could not afford more SAN space at this time. This was one of our larger databases and had been around for a while under different developer teams and dba management too.

Previous Efforts:

The largest tables were recently archived out of the database and only the most recent stuff in those tables remained. Several old tables and indexes were identified and deleted altogether. Transactional replication was then set-up to keep data in sync for reporting purposes to another system. These two efforts resulted in a significant amount of free space internal to the database but not available to the operating system (OS). A surge for the virtual machine (vm) was requested to temporarily increase CPU, memory and move disks to the fastest disks on the SAN and also request a temporary increase in the amount of available space for file movements.

DBCC SHRINKFILE had taken four hours and only returned 30GB of space back to the OS. DBCC SHRINKDATABASE had been run but only generated a lot of transaction log activity after a few hours; no space was returned back to OS. The DBA team was given the go ahead to get more aggressive and come up with another plan to reduce disk space. Another point: the existing index defrag job was mainly running reorganize and not rebuild for quite some time. So it probably was a good time to re-build ALL indexes which will also freshen up all of the index related statistics on those tables.

Limitations:

Approximately a four hour window to work in. Option is open to restrict access to database if needed.

Objectives:

Free disk space back to the OS. Set things up so as to create a baseline for further performance tuning.

With all of the easy buttons pushed it was time to get dirty. Based on their passion and love of talking about shrinking database files (Brent Ozar: Stop Shrinking Your Database Files. Seriously. Now. bit.ly/1bxolac) and Paul Randal (Why you should not shrink your data files bit.ly/1OZrdcF) I knew this was going to get ugly. I had been down this road before so I knew what was ahead and I didn’t like it but jumped in and attacked it.

Here is the strategy I wanted to use: I needed to move ALL tables and indexes out of their current filegroups and into a brand new filegroup and file. Then I can shrink the empty PRIMARY filegroup once it is empty of user created objects, same for the other filegroup, then delete any extra files from the database and clean things up.

Questions: 1) How do you move a table from one filegroup to another when it has a clustered index (CI) on the primary key (PK) and 2) How do you move a heap from one filegroup to another without dropping and re-creating the table?

Answer to 1: Script out both create and drop statements (or use the elegant CREATE INDEX DROP_EXISTING = ON and specify the new filegroup name) on all of the all nonclustered indexes (NCI), script out and drop all foreign key (FK) references to the table, drop the PK (which effectively drops the CI), re-create the PK/CI specifying the new filegroup, add the FKs, and add the NCI(s) back in specifying the new filegroup. Do this for EACH table. Take your time and double check your work.

Answer to 2: Create a temporary CI on the heap and specify the new filegroup then drop the CI. This effectively moves the table and also does two other things: a) removes forwarded records and b) eliminates fragmentation of the heap. Script out NCI(s) too for the new filegroup.

Scripts for the job:

@AdamMachanic

sp_whoisactive (bit.ly/1zgEZpT) <- mainly to monitor blocking and observe waits

script_table_keys.sql (bit.ly/1EfDfMm) <- be careful how it handles NCI(s)

@pinaldave

sql-server-list-all-objects-created-on-all-filegroups-in-database (bit.ly/1OZtp3Z)

Monitor log space usage during operations with DBCC SQLPERF(LOGSPACE); and backup t-log when needed, and also frequently monitor the SQL ERRORLOG.

Manual script time:

Create a new filegroup and then create a new file for that filegroup. Size it accordingly but let it auto-grow in small, fixed increments. The scripts I put together do things table by table. That way I can stop at anytime if needed. Using the script from Pinal Dave, sorted by ASC so list matches SQL Management Studio. Expand indexes and match up output from Adam’s. Script out indexes together and do the same for re-creating the NCI(s) on the new filegroup.

For a reason that no one could remember, the fillfactor was changed across all NCI(s). Without evidence of page splits causing performance problems and seeing how we are having disk space issues, I made the decision to set it back to 100% from its 95% setting. If we need to change it later we can. Note on SQL Server Standard Edition; no on-line index re-build option is available. Be aware that on a heavily used system a re-build will cause blocking! We had also debated about changing the indexes to SORT_IN_TEMPDB but decided against it as the current but old re-indexing job could be affected.

Gotchas

Spaces in object names. Argh! Remember from above the database also had two files in the PRIMARY filegroup. I had to be sure all user-created objects were out of it before I could do things with it. Some of the FK(s) didn’t make sense to me but I re-created them as-is anyway. The syntax at times looked like it had been created by some kind of ORM tool.

Success!

After running the scripts table by table, I was then able to run a DBCC SHRINKFILE( file_name, EMPTYFILE) on the second file of the PRIMARY filegroup then drop it. Ran SHRINKFILE on remaining file from other filegroup. Instant space savings back to the OS! Delete old files.

Lessons Learned

Getting support from the business and an outage window is critical. Freeing up a ton of disk space can make people look bad so be sensitive and mindful of that fact. Test and re-test scripts on DEV/QA system. Get replication set-up to mirror production and test behavior on the handful of replicated tables.

Summary of Results

All tables have been re-built as a result of specifying CI on a new filegroup.

All NCI indexes have been re-built in the proper order: CI first then NCI(s).

All forwarded records and heap fragmentation have been eliminated.

All index related stats have been rebuilt with FULLSCAN as a direct result of re-building the tables and indexes on the new filegroup.

Database is now roughly 75% smaller on disk than before operations.

Final Act

Be sure to manually increase size of the data file by a modest amount to leave room for growth so as to be proactive and prevent a filegrowth event. Run DBCC CHECKDB w/ extended checks then backup the database and test restore.

These trials and tribulations are not easy to do and they are time-consuming. Scripts can and do become obsolete between deployments so heads-up and keep your checklist flexible.

Thanks again to @eleightondick for the #SQLNewBlogger challenge!

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating