
Spring may be long gone but you still need to clean up MSDB


System databases are often neglected, partly because of Microsoft’s message that SQL Server will run and look after itself. In fairness it does to a certain extent (no, that’s not a pun): out of the box it’s configured not to be the fastest, but to be stable. The problem is that SQL Server is a bit messy and doesn’t like to clean up after itself. It knows right from wrong and has a few life skills, but tidiness is not one of them. In this regard I would liken it to a teenager.

So, here we are with a room full of mess left by our teenager with no hope of them ever cleaning it up on their own. They need us to set some ground rules and then they will do it (hopefully).

In this scenario it’s the SQL Server Agent that’s the teenager and MSDB that’s the bedroom. The SQL Agent has been off doing its own sweet thing and treating the place like a hotel. To be fair, it has been starting jobs and reporting on them, logging lots of information about backups, restores and a whole host of other things. The problem is that it has left all that information on the floor, and navigating to the other side of the room to put something new away takes longer and longer each time because of the mess.

Setting out your stall

Let’s take a look at an instance from the lab environment on my laptop, one that I keep messy on purpose and that uses all the standard configuration settings.

If I run the following code I see that MSDB doesn’t have auto close or auto shrink on, and that it does have auto create stats and auto update stats on. Yay Microsoft, for these are good things! There are edge cases, but turning auto close or auto shrink on is a bad thing, and bad things will happen to you. Namely me finding you, shutting your fingers in a drawer and then making you type out a new resume.

What I do notice is that MSDB is in the simple recovery model by default. If this were a production environment I would want to change that. It could be business critical to know exactly which jobs and backups have run when I come to bring something back online, and with MSDB in the simple recovery model there is the potential for some of that data to be lost.

 

SELECT
    name,
    create_date,
    compatibility_level,
    recovery_model_desc,
    page_verify_option_desc,
    is_auto_close_on,
    is_auto_shrink_on,
    is_auto_create_stats_on,
    is_auto_update_stats_on
FROM sys.databases
WHERE name = 'msdb';

 

[Screenshot: results of the query above, showing msdb’s default settings]

Does your environment look similar?
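If you do decide that msdb warrants the full recovery model on a production instance, the change itself is a one-liner; just remember that full recovery only pays off if you also schedule regular log backups of msdb. A minimal sketch (the backup path is only a placeholder):

-- Switch msdb to the full recovery model
ALTER DATABASE msdb SET RECOVERY FULL;

-- Take a full backup so the log chain has a starting point
BACKUP DATABASE msdb TO DISK = N'C:\SQLBackups\msdb_full.bak';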

 

Big isn’t always beautiful

Next we’re going to line up, show each other, and discuss who has the biggest. I am, of course, talking about the size of the MSDB database. In this instance (yes, that was a pun) the environment is a virtual machine on a laptop, so it isn’t running 24/7 and I’m not worried that mine isn’t the biggest in this particular competition. Are you unsure how to measure how big yours is? There are a couple of different ways; for this example I’m going to use one of the built-in reports in SQL Server Management Studio.
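If you prefer a query to a report, a rough sketch against sys.master_files gives the same headline numbers. The size columns there are in 8KB pages, so the query converts them to MB:

-- File sizes and growth settings for msdb
SELECT
    name,
    type_desc,
    size * 8 / 1024 AS size_mb,
    CASE max_size WHEN -1 THEN 'unlimited'
         ELSE CAST(max_size * 8 / 1024 AS varchar(20)) END AS max_size_mb,
    growth,
    is_percent_growth
FROM sys.master_files
WHERE database_id = DB_ID('msdb');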

Using the Disk Usage Report

To open the Disk Usage report, perform the following steps (I’m assuming some familiarity here):

  1. Navigate to your MSDB Database in Object Explorer
  2. Right Click on MSDB
  3. Choose Reports
  4. Choose Standard Reports
  5. Choose Disk Usage

 

[Screenshot: the Disk Usage report for msdb]

 

So there are a couple of points of interest here.

  1. I’ve left the regional settings as US rather than UK, which means dates are really going to confuse me at some point.
  2. I have one auto growth event showing. I know I have had more than one, because this database is much larger than the other MSDB databases in the same lab, and you can check how much the data and log files are set to grow by under Properties when you right click on the database. The growth information in this report comes from something called the default trace; unfortunately I don’t have the full history because the default trace rolls over. If you want to keep a full history then you will need to capture these events yourself (a query is sketched after this list). If the database has growth events in there then it has not been sized correctly: take stock of the information you have and plan to grow your databases to the correct size at a time that will not affect your users.
  3. The growth took 120ms to grow the file by 4MB. Depending on how long these grows take, it may indicate that your instance does not have instant file initialization enabled; that is worth checking (also sketched after this list).
  4. MSDB has space allocated to indexes. Interesting, as most people I know only defragment the key indexes of their user databases. Do you defragment your system database indexes?
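For points 2 and 3, here is a minimal sketch of what those checks might look like. The first query reads auto growth events for msdb from the default trace (event classes 92 and 93 are data and log file auto grow); the second checks instant file initialization via sys.dm_server_services, and that column only exists on newer builds (SQL Server 2016 SP1 and later), so treat it as an assumption about your version:

-- Read auto-grow events for msdb from the default trace
-- (92 = Data File Auto Grow, 93 = Log File Auto Grow)
-- Note: this reads the current trace file onward; older rollover files on disk
-- won't be included unless you point at the base log.trc file instead.
DECLARE @tracefile nvarchar(500);

SELECT @tracefile = path
FROM sys.traces
WHERE is_default = 1;

SELECT
    te.name                     AS event_name,
    t.DatabaseName,
    t.FileName,
    t.StartTime,
    t.Duration / 1000           AS duration_ms,   -- Duration is reported in microseconds
    (t.IntegerData * 8) / 1024  AS growth_mb      -- IntegerData is the growth in 8KB pages
FROM sys.fn_trace_gettable(@tracefile, DEFAULT) AS t
JOIN sys.trace_events AS te ON te.trace_event_id = t.EventClass
WHERE t.EventClass IN (92, 93)
  AND t.DatabaseName = N'msdb'
ORDER BY t.StartTime;

-- Check instant file initialization (column available on SQL Server 2016 SP1+)
SELECT servicename, instant_file_initialization_enabled
FROM sys.dm_server_services;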

 

Now check the same report on your instance and see what you can spot. Looking good?
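On point 4 above, if you want to see whether msdb’s indexes need any attention, a hedged sketch using sys.dm_db_index_physical_stats follows; the 30 per cent and 100 page thresholds are only illustrative figures, not rules:

USE msdb;

-- Fragmentation of msdb indexes, using a sampled scan to keep it light
SELECT
    OBJECT_NAME(ips.object_id)  AS table_name,
    i.name                      AS index_name,
    ips.index_type_desc,
    ips.avg_fragmentation_in_percent,
    ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id  = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 30
  AND ips.page_count > 100          -- tiny indexes aren't worth worrying about
ORDER BY ips.avg_fragmentation_in_percent DESC;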

 

What’s taking up all the space in MSDB?

The Disk Usage report isn’t the only report that ships with SQL Server Management Studio. This time we’re going to look at Disk Usage by Top Tables to see which tables are taking up the most space.

To open the Disk Usage by Top Tables report, perform the following steps (again, I’m assuming some familiarity here):

  1. Navigate to your MSDB Database in Object Explorer
  2. Right Click on MSDB
  3. Choose Reports
  4. Choose Standard Reports
  5. Choose Disk Usage by Top Tables

 

[Screenshot: the Disk Usage by Top Tables report for msdb]

 

From this I can see that the majority of the data is associated with backups and jobs, with some log shipping thrown in for good measure. The question now is: do I need it all?
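If you’d rather not click through the report, a rough equivalent can be sketched from sys.dm_db_partition_stats. The numbers won’t match the report exactly, but it ranks the same tables:

USE msdb;

-- Top 10 msdb tables by reserved space
SELECT TOP (10)
    OBJECT_SCHEMA_NAME(ps.object_id) AS schema_name,
    OBJECT_NAME(ps.object_id)        AS table_name,
    SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count,
    SUM(ps.reserved_page_count) * 8 / 1024 AS reserved_mb,
    SUM(ps.used_page_count) * 8 / 1024     AS used_mb
FROM sys.dm_db_partition_stats AS ps
JOIN sys.tables AS t ON t.object_id = ps.object_id
GROUP BY ps.object_id
ORDER BY SUM(ps.reserved_page_count) DESC;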

 

Clearing up the mess

So, how do I clear down the backup and restore history that I no longer need? Chances are I won’t need backup information from a year ago, and if I do I may be able to read it from an old copy of MSDB. Choose the retention period that is best for your environment and then set up a job that regularly runs the procedure sp_delete_backuphistory. This procedure takes a date parameter and deletes all backup-related data from before that date.
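As a minimal sketch, assuming a 90-day retention period (pick whatever suits your environment):

-- Remove backup and restore history older than 90 days
DECLARE @oldest_date datetime = DATEADD(DAY, -90, GETDATE());

EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = @oldest_date;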

A note of caution

There is another system stored procedure, scarily similar in name to the one mentioned above, called sp_delete_database_backuphistory. The difference is that this one takes a database name as a parameter and then deletes all of the backup history for that database. That would normally be a bad thing, except on one occasion: when you want to delete orphaned history left behind by dropped databases.

The following code will help you to find orphaned database information:

SELECT
    BS.database_name,
    COUNT(BS.backup_set_id) AS Orphans
FROM msdb.dbo.backupset AS BS
LEFT OUTER JOIN master.dbo.sysdatabases AS SD
    ON SD.name = BS.database_name
WHERE SD.name IS NULL
GROUP BY BS.database_name;
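Once you’ve confirmed a database in that list really is gone for good, the clean-up is one call per orphan. A quick sketch (the database name below is obviously a placeholder):

-- Delete all backup history for a database that no longer exists on this instance
EXEC msdb.dbo.sp_delete_database_backuphistory @database_name = N'OldDroppedDatabase';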

A few other procedures of note that you might like to look at:

  • sp_delete_jobsteplog is used to maintain the sysjobstepslogs table in the msdb database.
  • sp_purge_jobhistory will clear job history for an individual job based on a date parameter; if no job name or job ID is specified, all jobs are affected. This can be useful when you want different retention periods for different jobs (a sketch follows this list).
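For example, a hedged sketch of a per-job purge; the job name and retention periods are placeholders to adjust for your environment:

-- Keep only 30 days of history for one particularly chatty job...
DECLARE @cutoff datetime = DATEADD(DAY, -30, GETDATE());
EXEC msdb.dbo.sp_purge_jobhistory @job_name = N'Frequent Log Backup Job', @oldest_date = @cutoff;

-- ...and 90 days for everything else (omitting @job_name affects all jobs)
DECLARE @general_cutoff datetime = DATEADD(DAY, -90, GETDATE());
EXEC msdb.dbo.sp_purge_jobhistory @oldest_date = @general_cutoff;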

 

Your action point

Take a look at the system stored procedures that I have mentioned in this post and work with your business to find the best retention periods for your business cycles. Once you have done that, create some jobs to automate the clean-up for you (a sketch follows), and don’t forget to add this to your server build list!
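As a starting point, here is a minimal sketch of such a job built with the standard SQL Agent procedures; the job name, schedule and 90-day retention are all assumptions to adjust for your environment:

USE msdb;

-- Create the clean-up job
EXEC dbo.sp_add_job
    @job_name = N'MSDB history clean-up',
    @enabled  = 1;

-- Single T-SQL step that trims backup and job history older than 90 days
EXEC dbo.sp_add_jobstep
    @job_name      = N'MSDB history clean-up',
    @step_name     = N'Purge old history',
    @subsystem     = N'TSQL',
    @database_name = N'msdb',
    @command       = N'DECLARE @cutoff datetime = DATEADD(DAY, -90, GETDATE());
EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = @cutoff;
EXEC msdb.dbo.sp_purge_jobhistory @oldest_date = @cutoff;';

-- Run daily at 02:00
EXEC dbo.sp_add_jobschedule
    @job_name          = N'MSDB history clean-up',
    @name              = N'Daily at 2am',
    @freq_type         = 4,        -- daily
    @freq_interval     = 1,
    @active_start_time = 020000;

-- Attach the job to this server so the Agent will run it
EXEC dbo.sp_add_jobserver
    @job_name = N'MSDB history clean-up';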
