SQLServerCentral Article

SSIS Variable and Parameter Analysis

,

Introduction

If you have an SSIS project that's matured over time and makes good use of variables and parameters, you're probably nervous about deleting any of them, even if you think that they are no longer in use. Wouldn't it be great to be sure and then tidy things up with confidence?

I had one such project and decided to tackle the problem in code, after being unable to find anything already written which covers the same ground. BI Developer Extensions looked promising, but I found that it was buggy in this area. It suggested that some of my variables were unused, yet they were definitely being used.

This also gave me the chance to improve my almost non-existent T-SQL XPath skills, with some much-needed help from the community here at SQLServerCentral.

Sample Output

I created a simple SSIS project containing a single package and added a few parameters and variables. The variable, SourceText, is referenced by five other variables. The variable, Splitter3, is referenced in a Script Task, and the project parameter, StrangeProjectParam, is referenced in a Connection Manager. In SSDT, the declarations of these variables and parameters appear as follows:

When I run my T-SQL script against the folder containing the package, I see the following results:

The script returns a list of all of the variables and parameters in the project, together with their design-time values, and adds a calculated  'NumberOfReferences' column.

If NumberOfReferences is zero, no references to the parameter or variable have been found, and it should be safe to delete it. I say 'should' because, despite my testing, SSIS projects can be very complicated things and there may be cases which my code does not detect. If you find such a case, please let me know.

Can I Use It Now?

If you want to jump straight in and try it for yourself, here's what to do.

  1. Determine the user running your SQL Server service (by default, this may be NT Service/MSSQLSERVER)
  2. Make sure that this user has at least Read access to the place where your SSIS project files are stored (because the code analyses files in the file system, not in SSISDB)
  3. Grab a copy of the code (see the later section for the source code)
  4. Modify the row towards the beginning of the script which declares the @FolderPath parameter to be the path to the SSIS project of interest in your own environment. This should be the folder which contains the packages and Project.params files you wish to analyse.
  5. Execute in SSMS

Read on if you are interested in how it works. Otherwise, jump straight to the code section.

Solution Logic

This is a description of how the code works.

  1. Import the XML for all SSIS packages (*.DTSX), project parameters (Project.params) and connection managers (*.conmgr) into a temporary table.
  2. Create a temporary table to hold the results of performing the analysis.
  3. Query the table from (1) above to obtain a list of package variables and parameters, along with their design-time values and expressions and insert the results in the results table.
  4. Update the NumberOfReferences column in the results table, for the package variables and parameters.
  5. Query the table from (1) above to obtain a list of project parameters, along with their design-time values and expressions and insert the results in the results table.
  6. Update the NumberOfReferences column in the results table, for the project parameters, by scanning all of the imported packages and connection managers.
  7. Perform a corrective update on the NumberOfReferences column, to fix the counts in cases where there are multiple variables or parameters which start with the same characters (eg, 'path', 'paths', 'path1', 'path11').
  8. Return the results by selecting from the results table

Techniques Used

Here are a few of the techniques that the code uses.

Importing an XML File to a SQL Server Table

The (unsupported) xp_DirTree extended stored procedure is used to get a list of files to be imported and these are put into a temporary table. A cursor loops round the entries in the temporary table and imports the files' contents (all of which are XML) into another temporary table. using OPENROWSET().

Extracting Variable and Parameter Declaration Information from XML

For me, this was the most challenging part of the process – the extraction of tabular-format data from XML documents. The code relies on two things: a knowledge of XPath and an understanding of the structure of the XML in the document you are pulling data from.

For illustration, I created a very simple package containing a single variable, then edited its XML and stripped it down to include only those parts relevant to the variable itself. It looks like this:

<?xml version="1.0"?>
<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts"
  DTS:refId="Package">
  <DTS:Property
    DTS:Name="PackageFormatVersion">8</DTS:Property>
  <DTS:Variables>
    <DTS:Variable
      DTS:Namespace="User"
      DTS:ObjectName="Var1">
      <DTS:VariableValue
        DTS:DataType="3">99</DTS:VariableValue>
    </DTS:Variable>
  </DTS:Variables>
  <DTS:Executables />
</DTS:Executable>

This remains a valid XML document. Here is the code which extracts the variables' declaration information. It's the CROSS APPLY that provides the way to burrow into the XML and extract information from any path containing the Executable/Variables/Variable hierarchy. Looking at the XML above, it should be clear how the variable name, value and namespace are obtained. The variable in my sample package did not have an Expression associated with it.

SELECT PackageName = fl.FileName
      ,ItemType = 'PackageVariable'
      ,ItemNamespace = v.n1.value('(@*:Namespace)[1]', 'varchar(30)')
      ,ItemName = v.n1.value('(@*:ObjectName)[1]', 'varchar(30)')
      ,ItemValue = v.n1.value('(*:VariableValue)[1]', 'varchar(4000)')
      ,ItemExpression = v.n1.value('(@*:Expression)[1]', 'varchar(4000)')
FROM #FileList2 fl
    CROSS APPLY fl.FileXML.nodes('//*:Executable/*:Variables/*:Variable') v(n1)

#FileList2 is a temporary table containing an XML column (containing the XML of packages) called FileXML.

Counting the Number of References to Variables and Parameters

Once you have a list of all of the variables and parameters, along with all of the file contents, as strings in a table, it's easy enough to count the number of occurrences. We do this by replacing the qualified variable/parameter name with an empty string and comparing the length of the string containing the replacements with the length of the original string. The original string is the file's XML, converted to NVARCHAR(MAX).

One complication is where there are multiple variables that start with the same text ('path' and 'path1', for example), which leads to over-counting. There is a separate section towards the end of the code which corrects any over-counting.

Code to Perform Variable and Parameter Analysis

This code was developed on SQL Server 2017 and tested against packages developed in Visual Studio 2017.

A Note on Collapsible Regions

I use an SSMS add-on product called SSMSBoost. One of my favourite features of this product is the ability to define collapsible code regions. I also used the 'hidden' dark theme. In SSMS, with the regions all collapsed, the code for the entire script looks like this:

SSMS query text with regions

 

This is why my code contains --#Region and --#Endregion comment lines – these are interpreted by SSMSBoost.

Source Code

USE tempdb;
GO
/*
SSIS Parameter and Variable Analysis
------------------------------------
Author: Phil Parkin
Date: 5 June 2020
Possible Future Refinements
---------------------------
1) Allow breakdown of project parameters to show in which package/connection they were found and number of occurrences in each
Limitations and Notes
---------------------
1) If parameter values are passed from parent package to child, and those parameters are not subsequently used in the child, 
they will be detected as 'unused' within the child package
2) Tested only on VS2017
3) If there are multiple instances of a package variable with the same name, but multiple scopes, this routine will fail. 
I could probably code round this, if anyone can convince me that this practice makes any sense!
(Violation of PRIMARY KEY constraint ... Cannot insert duplicate key in object 'dbo.#SSISResults')
4) The SQL Server service user (eg, NT SERVICE\MSSQLSERVER) must have read access to the folder containing the package and project parameter files
5) Set the @FolderPath to be path to the folder which contains the packages and project parameters
6) If you have added comments or descriptive text anywhere in your package which includes the qualified name of a variable or parameter (eg, User::VariableName), it will be counted as an occurrence
7) Manipulate the final WHERE clause as desired.
*/SET NOCOUNT ON;
--*** Change the path which appears here to suit your environment ***
DECLARE @FolderPath VARCHAR(250) = 'C:\Temp\SSISTest';
DECLARE @SQL VARCHAR(MAX);
DECLARE @FileName NVARCHAR(260);
DROP TABLE IF EXISTS #FileList;
DROP TABLE IF EXISTS #FileList2;
DROP TABLE IF EXISTS #FileXML;
BEGIN TRY
    --__________________________________________________________________________________________________________________________________
    --#region Import files of interest to temp tables #FileList/#FileList2
    CREATE TABLE #FileList
    (
        FileName NVARCHAR(260) NOT NULL PRIMARY KEY CLUSTERED
       ,depth INT NOT NULL
       ,[file] INT NOT NULL
    );
    CREATE TABLE #FileList2
    (
        FileName NVARCHAR(260) NOT NULL PRIMARY KEY CLUSTERED
       ,Extension VARCHAR(10)
       ,FileXML XML
       ,FileNVarchar NVARCHAR(MAX)
    );
    INSERT #FileList
    (
        FileName
       ,depth
       ,[file]
    )
    EXEC master.sys.xp_dirtree @FolderPath, 1, 1;
    INSERT #FileList2
    (
        FileName
    )
    SELECT fl.FileName
    FROM #FileList fl;
    DROP TABLE IF EXISTS #FileList;
    UPDATE fl
    SET fl.Extension = (CASE
                            WHEN fl.FileName LIKE '' THEN
                                REVERSE(LEFT(rev.NameFile, CHARINDEX('.', rev.NameFile) - 1))
                            ELSE
                                ''
                        END
                       )
    FROM #FileList2 fl
        CROSS APPLY
    (SELECT NameFile = REVERSE(fl.FileName)) rev;
    DELETE #FileList2
    WHERE Extension NOT IN ('dtsx', 'params', 'conmgr');
    DECLARE @NumFiles INT =
            (
                SELECT COUNT(*) as Computed  FROM #FileList2 fl2
            );
    IF @NumFiles = 0
        THROW 52000, 'No SSIS files found. Please verify the value of @FolderPath and that the SQL Server service account has access to that folder path.', 1;
    CREATE TABLE #FileXML
    (
        FileXML XML NOT NULL
    );
    DECLARE files CURSOR LOCAL FAST_FORWARD FOR
    SELECT fl.FileName
    FROM #FileList2 fl;
    OPEN files;
    FETCH NEXT FROM files
    INTO @FileName;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        DELETE #FileXML;
        SET @SQL
            = CONCAT(
                        'SELECT BulkColumn = CONVERT(XML, BulkColumn) FROM OPENROWSET(BULK '''
                       ,@FolderPath
                       ,'/'
                       ,@FileName
                       ,'''
                   ,SINGLE_BLOB
              ) x;'
                    );
        INSERT #FileXML
        (
            FileXML
        )
        EXEC (@SQL);
        UPDATE fl
        SET fl.FileXML = fx.FileXML
           ,fl.FileNVarchar = CAST(fx.FileXML AS NVARCHAR(MAX))
        FROM #FileList2 fl
            CROSS JOIN #FileXML fx
        WHERE fl.FileName = @FileName;
        FETCH NEXT FROM files
        INTO @FileName;
    END;
    CLOSE files;
    DEALLOCATE files;
    --#endregion Import files of interest to temp tables #FileList/#FileList2
    --__________________________________________________________________________________________________________________________________
    --#region Create temp table to hold the results of the analysis
    DROP TABLE IF EXISTS #SSISResults;
    CREATE TABLE #SSISResults
    (
        PackageName VARCHAR(100) NOT NULL
       ,ItemType VARCHAR(50) NOT NULL
       ,ItemNamespace VARCHAR(50) NOT NULL
       ,ItemName VARCHAR(100) NOT NULL
       ,ItemValue VARCHAR(4000) NULL
       ,ItemExpression VARCHAR(4000) NULL
       ,NumberOfReferences INT NULL
            PRIMARY KEY CLUSTERED (
                                      PackageName
                                     ,ItemType
                                     ,ItemNamespace
                                     ,ItemName
                                  )
    );
    --#endregion Create temp table to hold the results of the analysis
    --__________________________________________________________________________________________________________________________________
    --#region Extract the package variables and parameters first
    INSERT #SSISResults
    (
        PackageName
       ,ItemType
       ,ItemNamespace
       ,ItemName
       ,ItemValue
       ,ItemExpression
    )
    SELECT PackageName = fl.FileName
          ,ItemType = 'Package Parameter'
          ,ItemNamespace = '$Package'
          ,ItemName = p.n1.value('(@*:ObjectName)[1]', 'varchar(30)')
          ,ItemValue = p.n1.value('(*:Property)[1]', 'varchar(4000)')
          ,ItemExpression = NULL
    FROM #FileList2 fl
        CROSS APPLY fl.FileXML.nodes('//*:Executable/*:PackageParameters/*:PackageParameter') p(n1)
    WHERE fl.Extension = 'dtsx'
    UNION ALL
    SELECT PackageName = fl.FileName
          ,ItemType = 'PackageVariable'
          ,ItemNamespace = v.n1.value('(@*:Namespace)[1]', 'varchar(30)')
          ,ItemName = v.n1.value('(@*:ObjectName)[1]', 'varchar(30)')
          ,ItemValue = v.n1.value('(*:VariableValue)[1]', 'varchar(4000)')
          ,ItemExpression = v.n1.value('(@*:Expression)[1]', 'varchar(4000)')
    FROM #FileList2 fl
        CROSS APPLY fl.FileXML.nodes('//*:Executable/*:Variables/*:Variable') v(n1)
    WHERE fl.Extension = 'dtsx';
    --#endregion Extract the package variables and parameters first
    --__________________________________________________________________________________________________________________________________
    --#region Calculate the number of references to the package variables and parameters
    UPDATE res
    SET res.NumberOfReferences = (LEN(fl.FileNVarchar) - LEN(REPLACE(cast(cast(cast(fl.FileNVarchar as nvarchar(max)) as nvarchar(max)) as nvarchar(max)),cast(cast(cast( calc1.ReplaceString as nvarchar(max)) as nvarchar(max)) as nvarchar(max)),cast(cast(cast( '' as nvarchar(max as nvarchar(max as nvarchar(max)))))))))
                                 / LEN(calc1.ReplaceString)
    FROM #FileList2 fl
        JOIN #SSISResults res
            ON fl.FileName = res.PackageName
        CROSS APPLY
    (
        SELECT ReplaceString = CONCAT(res.ItemNamespace, '::', res.ItemName)
    ) calc1;
    --#endregion Calculate the number of references to the package variables and parameters
    --__________________________________________________________________________________________________________________________________
    --#region Now add the project parameters
    INSERT #SSISResults
    (
        PackageName
       ,ItemType
       ,ItemNamespace
       ,ItemName
       ,ItemValue
    )
    SELECT PackageName = 'Project'
          ,ItemType = 'Project Parameter'
          ,ItemNamespace = '$Project'
          ,ItemName = p.n1.value('(@*:Name)[1]', 'varchar(30)')
          ,ItemValue = q.n1.value('(.)[1]', 'varchar(4000)')
    FROM #FileList2 fl
        CROSS APPLY fl.FileXML.nodes('//*:Parameters/*:Parameter') p(n1)
        CROSS APPLY p.n1.nodes('*:Properties/*:Property[@*:Name="Value"]') q(n1)
    WHERE fl.Extension = 'params';
    --#endregion Now add the project parameters
    --__________________________________________________________________________________________________________________________________
    --#region Calculate the total number of references to the project parameters, across all packages
    WITH ParamTotals
    AS (SELECT sr.PackageName
              ,sr.ItemType
              ,sr.ItemNamespace
              ,sr.ItemName
              ,TotalNumberOfReferences = SUM(NumRefs.NumberOfReferences)
        FROM #SSISResults sr
            CROSS APPLY
        (
            SELECT ReplaceString = CONCAT(sr.ItemNamespace, '::', sr.ItemName)
        ) calc1
            CROSS APPLY
        (
            SELECT NumberOfReferences = (LEN(fl.FileNVarchar) - LEN(REPLACE(cast(cast(cast(fl.FileNVarchar as nvarchar(max)) as nvarchar(max)) as nvarchar(max)),cast(cast(cast( calc1.ReplaceString as nvarchar(max)) as nvarchar(max)) as nvarchar(max)),cast(cast(cast( '' as nvarchar(max as nvarchar(max as nvarchar(max)))))))))
                                        / LEN(calc1.ReplaceString)
            FROM #FileList2 fl
            WHERE fl.Extension = 'dtsx'
                  OR fl.Extension = 'conmgr'
        ) NumRefs
        WHERE sr.ItemType = 'Project Parameter'
        GROUP BY sr.PackageName
                ,sr.ItemType
                ,sr.ItemNamespace
                ,sr.ItemName)
    UPDATE sr
    SET sr.NumberOfReferences = t.TotalNumberOfReferences
    FROM #SSISResults sr
        JOIN ParamTotals t
            ON t.PackageName = sr.PackageName
               AND t.ItemType = sr.ItemType
               AND t.ItemNamespace = sr.ItemNamespace
               AND t.ItemName = sr.ItemName;
    --#endregion Calculate the total number of references to the project parameters, across all packages
    --__________________________________________________________________________________________________________________________________
    --#region Correct the calculated number of references, as needed
    --Where there are multiple items which begin with the same text (eg, path, paths, path1, path11, ...), the item-counting technique above 
    --does not work correctly (every occurrence of user::path1 would also be counted as an occurrence of user::path, for example)
    WITH OverCounted
    AS (SELECT sr.PackageName
              ,sr.ItemType
              ,sr.ItemNamespace
              ,sr.ItemName
              ,sr.NumberOfReferences
        FROM #SSISResults sr
        WHERE EXISTS
        (
            SELECT 1
            FROM #SSISResults sr2
            WHERE sr.PackageName = sr2.PackageName
                  AND sr.ItemType = sr2.ItemType
                  AND sr.ItemNamespace = sr2.ItemNamespace
                  AND sr.ItemName <> sr2.ItemName
                  AND sr.ItemName = LEFT(sr2.ItemName, LEN(sr.ItemName))
        ))
        ,Deductions
    AS (SELECT OverCounted.PackageName
              ,OverCounted.ItemType
              ,OverCounted.ItemNamespace
              ,OverCounted.ItemName
              ,Deduction = SUM(sr3.NumberOfReferences)
        FROM OverCounted
            JOIN #SSISResults sr3
                ON sr3.PackageName = OverCounted.PackageName
                   AND sr3.ItemType = OverCounted.ItemType
                   AND sr3.ItemNamespace = OverCounted.ItemNamespace
                   AND LEFT(sr3.ItemName, LEN(OverCounted.ItemName)) = OverCounted.ItemName
                   AND sr3.ItemName <> OverCounted.ItemName
        GROUP BY OverCounted.PackageName
                ,OverCounted.ItemType
                ,OverCounted.ItemNamespace
                ,OverCounted.ItemName)
    UPDATE sr4
    SET sr4.NumberOfReferences = sr4.NumberOfReferences - d.Deduction
    FROM #SSISResults sr4
        JOIN Deductions d
            ON d.PackageName = sr4.PackageName
               AND d.ItemType = sr4.ItemType
               AND d.ItemNamespace = sr4.ItemNamespace
               AND d.ItemName = sr4.ItemName;
    --#endregion Correct the calculated number of references, as needed
    --__________________________________________________________________________________________________________________________________
    SELECT *
    FROM #SSISResults sr
    --WHERE sr.NumberOfReferences = 0
    ;
END TRY
BEGIN CATCH
    THROW;
END CATCH;

Rate

5 (4)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (4)

You rated this post out of 5. Change rating