Stairway to Row-Level Security

Stairway to ScriptDOM Level 1 - An Introduction to ScriptDOM

,

Introduction

Automated checking of source code for programmatic and stylistic errors is called ‘Linting’. Tools that help with this are called ‘Linting tools’. A linting tool is basically a static code analyzer.

Last year, I researched some linting tools for the large T-SQL code base we have at work. I looked into several tools, both open-sourced and paid. One tool had a rule that called out queries that used more than three tables as a bad practice. We had several stored procedures with queries 6-7 tables or more and performed well. This rule would not work for us. Another tool had a rule that pointed out the lack of indexes on temp tables as a problem. The rule was designed for table scripts but worked the same on code.

Getting new rules that we wanted, such as not having unnamed primary keys on temp tables or enforcing our naming standards- meant extra work on the code. Our rules were custom to our environment. There were generic best practices for sure, but those were a small subset of what we needed.

I then started looking for a tool to make a custom linter. That is when I discovered ScriptDOM, which has been around for a long time, when it was part of the .NET framework. However, few people know about it or use it.

It took me some time to understand how to put this to use. But after I figured it out, it was straightforward. Now I have a reasonably robust, custom linter written in PowerShell and integrated well into our Azure DevOps Build process. It is easy to use, and we own it.

About This Stairway

In the upcoming Stairway levels, I plan to go through the basics of this tool – how to get it, install it, and understand how the library works. We will learn how to write PowerShell scripts that can harness ScriptDOM's power and build linting rules.

What is ScriptDOM?

ScriptDOM is short for T-SQL Document Object Model.

ScriptDOM is a.NET framework library that comes with the Microsoft DacFx package and can be downloaded from here. All you need to work with it is this one .dll called Microsoft.SqlServer.TransactSql.ScriptDom.dll.

What can ScriptDOM do?

ScriptDOM is like a swiss-army knife and has many uses. Its main uses are:

  1. Parse T-SQL code for syntax errors.
  2. Find patterns in T-SQL code.
  3. Format T-SQL code.
  4. Fix some findings.

ScriptDOM has been around for a long time, since SQL Server 2008. Starting with SQL Server 2012, it was shipped with SQL Server and is part of the DacFx package. The DacFx package can be conveniently installed on a laptop, and it does not need any active SQL Server installation.

Why Bother Learning ScriptDOM?

We will look at a few practical examples of where ScriptDOM is useful in everyday work.

Searching Code

We will start with an example of a traditional search. Let’s say I am tasked with finding NOLOCK hints in stored procedures. I have this code and want to know where a NOLOCK hint is being used.

USE [AdventureWorks2019];
GO
SET ANSI_NULLS ON;
GO
SET QUOTED_IDENTIFIER ON;
GO
CREATE OR ALTER PROCEDURE [dbo].[uspGetEmployeeManagers] @BusinessEntityID [INT]
AS
--Testing nolock
BEGIN
  SET NOCOUNT ON;
  -- Use recursive query to list out all Employees required for a particular Manager
  WITH [EMP_cte] ([BusinessEntityID], [OrganizationNode], [FirstName], [LastName], [JobTitle], [RecursionLevel]) -- CTE name and columns
  AS ( SELECT
                    e.[BusinessEntityID]
                  , e.[OrganizationNode]
                  , p.[FirstName]
                  , p.[LastName]
                  , e.[JobTitle]
                  , 0 -- Get the initial Employee
       FROM
                    [HumanResources].[Employee] e
         INNER JOIN [Person].[Person] (NOLOCK) AS p
           ON p.[BusinessEntityID] = e.[BusinessEntityID]
       WHERE        e.[BusinessEntityID] = @BusinessEntityID
       UNION ALL
       SELECT
                    e.[BusinessEntityID]
                  , e.[OrganizationNode]
                  , p.[FirstName]
                  , p.[LastName]
                  , e.[JobTitle]
                  , [RecursionLevel] + 1 -- Join recursive member to anchor
       FROM
                    [HumanResources].[Employee] e
         INNER JOIN [EMP_cte]
           ON e.[OrganizationNode] = [EMP_cte].[OrganizationNode].GetAncestor (1)
         INNER JOIN [Person].[Person] p
           ON p.[BusinessEntityID] = e.[BusinessEntityID])
  -- Join back to Employee to return the manager name
  SELECT
               [EMP_cte].[RecursionLevel]
             , [EMP_cte].[BusinessEntityID]
             , [EMP_cte].[FirstName]
             , [EMP_cte].[LastName]
             , [EMP_cte].[OrganizationNode].ToString () AS [OrganizationNode]
             , p.[FirstName] AS 'ManagerFirstName'
             , p.[LastName] AS 'ManagerLastName' -- Outer select from the CTE
  FROM
               [EMP_cte]
    INNER JOIN [HumanResources].[Employee] e
      ON [EMP_cte].[OrganizationNode].GetAncestor (1) = e.[OrganizationNode]
    INNER JOIN [Person].[Person] p (NOLOCK)
      ON p.[BusinessEntityID]                         = e.[BusinessEntityID]
  ORDER BY
               [RecursionLevel]
             , [EMP_cte].[OrganizationNode].ToString ()
  OPTION (MAXRECURSION 25);
END;
GO

What I would typically do is use T-SQL to search the code, as shown below:

SELECT
      OBJECT_NAME (OBJECT_ID) SP_Name
    , OBJECT_DEFINITION (OBJECT_ID) SP_Definition
FROM  sys.procedures
WHERE OBJECT_DEFINITION (OBJECT_ID) LIKE '%nolock%';

I get the stored procedure name and text in which the NOLOCK hint is found from the DMV query (shown in this image):

I do not get any information on which line the NOLOCK hint is on, and if I had any comment text that contained NOLOCK, that would be part of the results.

With ScriptDOM, what I find will be devoid of false positives like NOLOCK embedded in any comments. I will also get the exact location of the NOLOCK, with the line, column, and offset in the script.

Parsing Scripts

If I have a large number of scripts that need to be validated, the easiest way would be to try and run them and find out that there are errors. I could also open them manually in SSMS and then try to parse them there. Either way involves connecting to an active SQL Server and causing some overhead. Also, some scripts could be from older versions of SQL Server and there may not even be an install of the older version available.

With ScriptDOM I can get scripts verified for syntax errors on any version with 4 lines of PowerShell code. There is no need for any SQL Server connection.

Finding Nuanced Patterns

Let us say I need to find all occurrences of  SELECT * occurrence in stored procedures. I would have to use a query like the one shown below.

USE WIDEWORLDIMPORTERS;
GO
SELECT
      OBJECT_NAME (OBJECT_ID) SP_Name
    , OBJECT_DEFINITION (OBJECT_ID) SP_Definition
FROM  sys.procedures
WHERE OBJECT_DEFINITION (OBJECT_ID) LIKE '%SELECT *%';

This query again will return results like below, with the stored procedure name and code in which it found the “SELECT *” text.

I have a need though to exclude certain situations of usage of SELECT *, which are actually legitimate, such as:

  • In the If…EXISTS clause
  • When inserting into temp tables/table variables
  • When selecting from temp tables or table variables

Finding specific instances of SELECT * that exclude above scenarios are hard to do with the DMV search. It is possible to write regex queries, but those are time-consuming and difficult.

With ScriptDOM specific situations can be found with some basic programming effort using the DLL and a good understanding of .NET programming.

Fixing a Bad Pattern

Consider a scenario where NOLOCK hints need to be removed from a large codebase. There are two ways to accomplish this:

  • Use a text replace by opening the file(s) on SSMS and rerunning the script on the server(s). This could be semi-automated using techniques such as those described here.
  • Regex can be used with the search and replace in SSMS, as illustrated in the article here.

Both methods have the following issues:

  • The text replace can find a match that is not necessarily in code and replace it
  • Regex may be more accurate, but is fairly complex to write
  • To check the output of either of the methods above, you need to be connected to an active SQL Server. If there are a large number of scripts, this can be a cumbersome and time-consuming exercise.

Formatting Large Volumes of Code

Code that is poorly formatted is difficult to read and maintain. There are many tools available that can help format code as one is writing it, including one of my favorite tools, SQL Prompt. But large volumes of unformatted code are difficult to fix. They can require a lot of manual work, and it is unlikely that a person would be able to devote sufficient time to re-formatting all the code.

ScriptDOM comes with formatting capabilities that can be tuned according to a user’s preferences.

Finding Issues with Scripts Pre-rollout

Some settings, such as data compression on tables/indexes, are best found before the code runs on production. Ideally, if these issues are found during the build and cause a failure, that helps the developer fix them before they are deployed on live systems and cause issues.

Below is a simple script that uses ScriptDOM to find NOLOCK references.

#Find no lock hint in views/stored procs/functions
function Find-NoLockHintWithPattern
{
    [CmdletBinding()]
    param(
           $tsqlfragmentforrule
    )
    Try
    {
         class VisitorTableHintRef: Microsoft.SqlServer.TransactSql.ScriptDom.TSqlFragmentVisitor {
        
         [void]Visit ([Microsoft.SqlServer.TransactSql.ScriptDom.TableHint] $fragment) {
            
            $tablehint = $fragment
            if ($tablehint.HintKind -eq 'Nolock')
            {
                $errorline = "WARNING: NOLOCK hint found at "
                write-host $errorline $fragment.StartLine ":" $fragment.StartColumn ":" $fragment.FragmentLength -BackgroundColor Red

            }
        }
    
     }
     
        $visitortablehintref = [VisitorTableHintRef]::new()
        $tSqlFragmentforrule.Accept($visitortablehintref)
        
} 
catch {
    throw
}
} 

Below is a shell script that calls the above script.

function Find-LintingErrors
{
    
        #selectstatement contains the statement to be examined for existence of select *
    [CmdletBinding()]
    param(
          [string] $SelectStatement
    )
    
    try {
    Add-Type -Path "C:\ugpresentation\ScriptDom\Microsoft.SqlServer.TransactSql.ScriptDom.dll"
    
    $DDLParser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser($true)
    $DDLparserErrors = New-Object System.Collections.Generic.List[Microsoft.SqlServer.TransactSql.ScriptDom.ParseError]
    # create a StringReader for the script for parsing
    $stringReader = New-Object System.IO.StringReader($selectstatement)
    # parse the script
    $tSqlFragment = $DDLParser.Parse($stringReader, [ref]$DDLParsererrors)
    # raise an exception if any parsing errors occur
    if($DDLParsererrors.Count -gt 0) {
        throw "$($DDLParsererrors.Count) parsing error(s): $(($DDLParsererrors | ConvertTo-Json))"
    }
   Find-NoLockHintWithPattern $tSqlFragment
   
}
catch {
    throw
}
}

I call the shell script and pass in code from a stored procedure in WideWorldImporters. This procedure has two NOLOCK hints and one comment stating that a NOLOCK hint was added.

$ScriptData = @"USE [WideWorldImporters]
GO
/****** Object:  StoredProcedure [Integration].[GetOrderUpdates]    Script Date: 2/27/2022 4:37:48 PM ******/SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE OR ALTER PROCEDURE [Integration].[GetOrderUpdates]
@LastCutoff datetime2(7),
@NewCutoff datetime2(7)
WITH EXECUTE AS OWNER
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;
--Adding nolock hints in code
    SELECT CAST(o.OrderDate AS date) AS [Order Date Key],
           CAST(ol.PickingCompletedWhen AS date) AS [Picked Date Key],
           o.OrderID AS [WWI Order ID],
           o.BackorderOrderID AS [WWI Backorder ID],
           ol.[Description],
           pt.PackageTypeName AS Package,
           ol.Quantity AS Quantity,
           ol.UnitPrice AS [Unit Price],
           ol.TaxRate AS [Tax Rate],
           ROUND(ol.Quantity * ol.UnitPrice, 2) AS [Total Excluding Tax],
           ROUND(ol.Quantity * ol.UnitPrice * ol.TaxRate / 100.0, 2) AS [Tax Amount],
           ROUND(ol.Quantity * ol.UnitPrice, 2) + ROUND(ol.Quantity * ol.UnitPrice * ol.TaxRate / 100.0, 2) AS [Total Including Tax],
           c.DeliveryCityID AS [WWI City ID],
           c.CustomerID AS [WWI Customer ID],
           ol.StockItemID AS [WWI Stock Item ID],
           o.SalespersonPersonID AS [WWI Salesperson ID],
           o.PickedByPersonID AS [WWI Picker ID],
           CASE WHEN ol.LastEditedWhen > o.LastEditedWhen THEN ol.LastEditedWhen ELSE o.LastEditedWhen END AS [Last Modified When]
    FROM Sales.Orders (nolock) AS o
    INNER JOIN Sales.OrderLines AS ol
    ON o.OrderID = ol.OrderID
    INNER JOIN Warehouse.PackageTypes AS pt
    ON ol.PackageTypeID = pt.PackageTypeID
    INNER JOIN Sales.Customers (nolock) AS c
    ON c.CustomerID = o.CustomerID
    WHERE CASE WHEN ol.LastEditedWhen > o.LastEditedWhen THEN ol.LastEditedWhen ELSE o.LastEditedWhen END > @LastCutoff
    AND CASE WHEN ol.LastEditedWhen > o.LastEditedWhen THEN ol.LastEditedWhen ELSE o.LastEditedWhen END <= @NewCutoff
    ORDER BY o.OrderID;
    RETURN 0;
END;
GO
"@
Find-LintingErrors $scriptdata

The results I get are as shown in the image below. It may be noteworthy that the reference of NOLOCK in a comment was not included in the results. The results also contain the exact location of the hint (row:column:length).

Summary

To summarize, the main scenarios you may discover ScriptDOM a better choice over text searches/regex/various other methods

  1. Large codebase
  2. Need to find issues pre rollout
  3. Nuanced search conditions
  4. Findings that are accurate and provide you with the exact location of the pattern.
  5. Do all of this asynchronously - without connecting to a SQL Server.

In the following levels I will show how each of these can be accomplished using ScriptDOM and PowerShell.

Rate

4.75 (8)

Share

Share

Rate

4.75 (8)