SQL Server Integration Services (SSIS) is a great tool for building ETL processes. On SQL Server, we can configure an Integration Services Catalog to deploy and configure all our SSIS packages. This service relies on a catalog database named SSISDB, which needs to be maintained or we can quickly run out of disk space. In this article, we look at the catalog retention policy to better manage the disk space requirements of the SSISDB database.
SSISDB is the database that holds all the Integration Services catalog metadata, as well as package version and execution history. Depending on how frequently our packages run, the database can grow very quickly.
Fortunately for us, there is an SSIS Server Maintenance Job (a SQL Server Agent job) that runs every day to clean up and maintain the database. The problem is that the job depends on two configuration settings: one that enables the cleanup and one that sets the retention period, which is 365 days by default. Depending on our package activity, that retention window can cause the database to grow into the hundreds of gigabytes.
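To see where the space is going before changing anything, a query along these lines can help. It is only a sketch: it assumes we have permission to read the dynamic management views in SSISDB, and the column aliases are chosen here for illustration.

```sql
-- List the ten largest tables in SSISDB by reserved space.
USE SSISDB;

SELECT TOP (10)
    s.name + N'.' + t.name AS table_name,
    -- count rows only from the heap or clustered index to avoid double counting
    SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count,
    -- pages are 8 KB each; convert to megabytes
    SUM(ps.reserved_page_count) * 8 / 1024 AS reserved_mb
FROM sys.dm_db_partition_stats AS ps
JOIN sys.tables AS t ON t.object_id = ps.object_id
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
GROUP BY s.name, t.name
ORDER BY reserved_mb DESC;
```

On a busy catalog, the tables in the internal schema that store operation and event messages are typically the ones at the top of this list.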
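Before relying on the maintenance job, it may be worth confirming that it exists and is enabled. The following is a minimal sketch that assumes the job still has its default name and that we can read the msdb job tables.

```sql
-- Check that the SSIS maintenance job exists and is enabled (enabled = 1).
-- Assumes the job has not been renamed from its default name.
SELECT j.name, j.enabled, j.date_modified
FROM msdb.dbo.sysjobs AS j
WHERE j.name = N'SSIS Server Maintenance Job';
```

If the query returns no rows, or enabled is 0, the scheduled cleanup will not run regardless of the catalog settings.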
The SSISDB has a catalog schema which contains the objects that can enable us to manage the catalog configuration. We need to look at the following objects to view and update the configuration:
catalog.catalog_properties: This is a view for the catalog configuration.
catalog.configure_catalog: This stored procedure is used to update a configuration setting.
When selecting the information from the view, we get one row per catalog property, showing the property name and its current value.
When we query the view, we need to look at these two settings:
OPERATION_CLEANUP_ENABLED: This should be set to TRUE to enable the cleanup of historical data.
RETENTION_WINDOW: This is the number of days that data is retained. If this data is not critical, set it to a low number like 30 days or less.
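As a quick check, the two settings can be read from the catalog.catalog_properties view with a query like the one below. This is a minimal sketch that assumes the catalog database uses the default SSISDB name.

```sql
-- Read the current cleanup flag and retention window from the catalog.
SELECT property_name, property_value
FROM [SSISDB].[catalog].[catalog_properties]
WHERE property_name IN (N'OPERATION_CLEANUP_ENABLED', N'RETENTION_WINDOW');
```

Running the same query again after changing the settings is an easy way to confirm that the update took effect.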
Change the Settings
To enable this setting and set a low retention window, we can use a stored procedure within the catalog schema. We can run that procedure with our policy requirements. Let’s take a look at how that can be done with T-SQL:
--RUN IN THE CONTEXT OF THE SSISDB DATABASE
--SET CLEANUP ENABLE TO TRUE
exec [catalog].configure_catalog @property_name = N'OPERATION_CLEANUP_ENABLED', @property_value = N'TRUE'
--SET THE RETENTION WINDOW TO 30 days
exec [catalog].configure_catalog @property_name = N'RETENTION_WINDOW', @property_value = 30
--OPTIONAL RUN THE CLEANUP ON DEMAND OR WAIT FOR SCHEDULE TASK TO RUN
EXEC [SSISDB].[internal].[cleanup_server_retention_window]
Partial Cleanup for Large Databases
By setting those fields (cleanup and retention), we can run the stored procedure to clean up the data on demand, or we could also wait for the SQL job to run at its scheduled time and clean up the data for us.
In the event that the database is large, changing the retention window to a much lower number (e.g. from 365 days to 30 days) in one step may cause the job to eventually fail. For these cases, we need to decrease the retention window in smaller steps. For example, we could write a script that decrements the retention window by one and runs the cleanup procedure as shown here:
--run in the context of the SSISDB database
--we reduce the retention window by one until we reach the target window of 30
declare @index int = 364, @max int = 30
--the loop includes the target value itself, so the last pass sets the window to 30
while @index >= @max
begin
    exec [catalog].configure_catalog RETENTION_WINDOW, @index
    EXEC [SSISDB].[internal].[cleanup_server_retention_window]
    --shrink the log file as well ('log' is the logical name of the log file; check sys.database_files if yours differs)
    DBCC SHRINKFILE('log',10)
    set @index = @index - 1
end
The script cleans the data based on the retention window. If the original setting is 365, we set it one day lower and clean up one day of history at a time. You can adjust the step size to see what works for your environment. The goal is to clean up the data in segments that are small enough to avoid getting an error.
If the amount of data is very large, this script may take some time to run. Just let it run and monitor how the retention window decreases with every cycle.
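One way to monitor the progress is to run a query like the following from a second session while the script is working. It is a rough sketch: the column aliases are made up for illustration, and the catalog.operations view only returns the operations that the current login is allowed to see, so the counts are most meaningful for an administrator.

```sql
-- Show the current retention window next to what is left in the operation history.
SELECT
    (SELECT property_value
     FROM [SSISDB].[catalog].[catalog_properties]
     WHERE property_name = N'RETENTION_WINDOW') AS retention_window_days,
    COUNT(*) AS operations_remaining,
    MIN(created_time) AS oldest_operation
FROM [SSISDB].[catalog].[operations];
```

As the cleanup advances, the retention window value should step down and the oldest operation date should move forward.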
SSISDB, like any other database, needs to be maintained. Depending on the activity of our SSIS packages, we need to be mindful of the maintenance plan for this database. We need to review the catalog retention policy so that it stays within our disk space capacity.
Thanks for reading