Lately I've been looking for a good way to find Stored Procedures in a SQL Server 2005/2008 database. Sure, there exists tools such as APEX SQL Clean, which does part of the job for you, but unfortunately I can't get them to work exactly the way I want.
This whole quest started when I was trying to identify unused objects in a somewhat large and intricate database I inherited from one of our development partners. I tried the APEX SQL Clean application, but I wasn't really satisfied with the result. I needed an alternate way to solve the task. After Googling for a while I didn't really get the results I wanted so I thought about experimenting with this particular task myself.
For this I've used standard database development and Office tools such as SQL Profiler, SQL Server Management Studio (or osql.exe if you prefer command line) and Excel. I'm currently using SQL Server 2008 (haven't installed the R2 beta yet) and Excel 2010. Although I haven't tested, it should work well with both SQL Server 2005 and Office 2007.
First up I start with the SQL Server Profiler with a slightly changed version of the "SP_Counts" template. After my modifications I have based the template on the "SP:Starting" event with the following columns:
- ObjectName (name of the Stored Procedure)
- Duration (not really needed, I'll explain the purpose later)
- DatabaseName (not really needed since we'll filter on the database anyway)
- SPID (mandatory, I don't want it but Profiler obviously need it)
- StartTime (same as Duration)
- EndTime (same as Duration)
I've added one single filter that filters queries based on the DatabaseName (or DatabaseId. If you prefer that - change accordingly).
Based on this I will get a trace that contains all the Stored Procedures called by one or more applications during the period the trace is running. Do note that I've included the "Duration" field as well. This will help me with solving a secondary object of identifying which Stored Procedures takes the longest to finish.
I let the testers of my team perform their tests on a specific application for about 30 minutes. The result was a trace containing all the invoked Stored Procedures. It depends on the size of the database and application, but for my purpose it was sufficient with about 30 minutes of testing to get enough data in the trace.
All trace data was stored in a local table (local SQL Server instance) for later analysis (this step is important). Naming of the table is not really important. I stored mine as [Trace_2010-02-18] (guess when I ran the trace). With that I bypassed my own quite strict naming convention but it's only a temporary table in a temporary database so I really don't care. *Please don't tell anyone*
Next up is connecting the powerful pivoting capabilities of Excel 2007/2010 with the newly created trace table for the actual analysis.
In Excel, do the following steps:
1. Open Excel 2007/2010
2. On the "Data" ribbon, choose "From Other Sources" and then choose "From SQL Server"
3. Connect to the database where you've stored the trace table. Follow the Wizard; choose the correct database and table and finish.
4. Choose "Table" in the next step. You might want to choose the "PivotTable Report" option directly if you don't want to use the "Duration" column later. I'm going to use some calculated columns for the "Duration" field so I'll stick to "Table". Finish the Wizard.
5. Voila! You have a semi dynamic Excel Table with all the rows from the trace table. Notice that all Stored Procedures has multiple rows in the Excel table. That's because they've been called several times by the application.
6. Add columns to the right with a copy of the "Duration" column and call these "Duration (min)" and "Duration (max)". There may be a better way to do this but it works for now. If you know a better way, please let me know! Omit this step if you don't care about the "Duration" column.
7. Choose "Insert" from the ribbon and choose "PivotTable". Use the default settings unless you have a reason not to.
8. Drag the "ObjectName" field from the "PivotTable Field List" down to the "Row Labels" area and any of the other fields (I chose "SPID" down to the "Values" area. Ensure that the value field is based upon "Count" and not any of the other aggregated values.
9. If you want to look at the "Duration" values you can add the following fields as well: a. Duration, set the field to "Average" value b. Duration (min), set the field to "Minimum" value c. Duration (max), set the field to "Maximum" value.
Okay, first step done. Now you have a dynamic PivotTable that shows you all the Stored Procedures used together with an actual invocation count and how long it took for the stored procedure to complete (minimum, average and maximum time of execution). The latter is quite nice to use when you want to identify which Stored Procedures that takes a long time. Maybe you'd like to optimize some of these.
Next up come the task of adding a view which holds all the Stored Procedures in the database. We will use this list together with the PivotTable to find out which Stored Procedures we can wipe from the database.
1. In SQL Server Management Studio, create a view that consists of the following query:
(type = 'P')
Call this View something such as "DO_DatabaseObjects", "SO_SysObjects", "SPV_StoredProcedureView" (I chose the latter) or whatever your naming convention is.
2. Go back to Excel and create a new worksheet. In this sheet, create a new data connection to the same database that you created your view. Accept the default options and voila, you have a list of all Stored Procedures in the database (used as well as unused).
3. Create a new column next to the "name" column which you call something such as "Is present?"
4. Insert the following formula:
=IF(ISNA(MATCH([@name]; Name of Worksheet that holds the PivotTable!A:A;));"No";"Yes")
Things to do with the formula:
- The "name" field in the formula refers to the "name" column in the Excel table based on the view. If you've changed the output from the view, you have to change the name. Remember to include the square brackets or it won't work.
- The "Name of Worksheet that holds the PivotTable" refers to the, uhm, actual name of the worksheet that holds the PivotTable.
- The A:A bit refers to the column that holds the SP names. Change it if you've included other fields in front of it.
- The "Yes" and "No" values explain themselves quite good I recon. Either the SP is present, or it is not.
Now you should have a complete list of all the Stored Procedures in the database along with a column that says whether or not the respective SP is currently being used based on the workload you're constantly feeding it.
Now that you have a complete list of your unused Stored Procedures it's time to go through your database and delete them. A couple words of warning though: Do note that some Stored Procedures may be in the database for administrative purposes or for reasons unknown (at least to me). You should (if possible) leave the SQL Server Profiler trace on for a good period of time to ensure that you have a workload that actually includes ALL usage scenarios. I'm not responsible if you go on a delete rampage