SQLServerCentral Article

Formatting T-SQL Scripts using ScriptDOM and PowerShell

,

Well-formatted scripts are always easier to maintain and enhance readability to high degree. Formatting standards are an important part of coding standards anywhere and if not, they need to be. If there are a large number of scripts this can add up to a significant amount of time needed to maintain them. It is also not something that can be done entirely for the sake of cosmetic value. Even if one chooses to format lots of scripts, it can take considerable time and patience. Having a tool that can format scripts without altering any other content can be really useful. There are some third-party tools for T-SQL that do this, such as the formatting option provided by RedGate’s SQL Prompt.

Introducing ScriptDOM

The T-SQL Script Document Object Model, also known as ScriptDOM, provides a way of formatting code as well. ScriptDOM is a .NET framework-based library. ScriptDOM can be used by any .NET based language, VB.net, C#, and of course, PowerShell (>= 5.00). ScriptDOM is a swiss-army-knife kind of a tool; it can format, lint, and parse T-SQL code. ScriptDOM is very sparsely documented, and many of these capabilities have to be discovered by experimenting with the library and understanding it.

The library is named Microsoft.SQLServer.TransactSQL.ScriptDOM.dll and can be downloaded and installed along with Microsoft DacFx package from here. There are separate versions available for the .NET framework and .NET core. This example uses the .NET framework version.

Formatting Options

There are options provided in the library to configure how you want the formatting to look. Those are listed below.

  • Include semi colons at end of every statement
  • Align body inside of blocks
  • Aligns all column definitions for create view/table
  • Aligns set statements
  • 'AS' will be on its own line
  • Define indentation - only spaces
  • Indent set clauses
  • Indent body of view
  • Set keyword casing
  • Separate each column on insert source statement to its own line
  • Separate each column on insert target statement to its own line
  • Separate each column on select statement to its own line
  • Separate each item on set clause to its own line
  • Separate each column on view to its own line
  • Separate each line on where predicate to its own line
  • Insert a new line before ( on multi line list of columns
  • Insert a new line before from clause
  • Insert a new line before group by clause
  • Insert a new line before having clause
  • Insert a new line before join
  • Insert a new line before offset clause
  • Insert a new line before ) on multi line list of columns
  • Insert a new line before order by
  • Insert a new line before output clause
  • Insert a new line before where clause
  • Recognize syntax specific to engine type - to be safe use 0
  • 0 All 1 Engine 2 Azure
  • Version targeted:
    • 100 2
    • 110 3
    • 120 4
    • 130 5
    • 140 6
    • 150 7
    • 80     1
    • 90     0 (default)

Most of these are standard and defaults are very safe to use. The code has all of them specified in case anyone wants a different option from default. It is important, though, to pick the right SQL version. The default is 90, SQL Server 2005.

Options not available

There are a few options that are not available, but which would be nice. These are

  • using tabs to indent - currently only spaces are allowed, though it is possible to set the number of spaces
  • specify if commas come before or after column names
  • square brackets allowed or disallowed

Anatomy of usage

The first step involves connecting to the dll and setting some options. To do this, it is important to determine where the dll resides on the machine that runs the script. In most cases it will be in path below, but this may be different depending on your version of the library. The path is used in the Add-Type command.

The next step involves creating an object from the SQL150ScriptGenerator class. (Any version of SQL Server going back to 90 may be used. This example uses SQL150). The options for formatting are properties on this object that are set after creating it, as shown below. '$generator.Options.IncludesemiColons' means semicolons need to be included after each clause in the script. '$generator.Options.AlignClauseBodies' means the body of clauses will be left aligned. There is no need to specify each of these options if the defaults are sufficient. In below script options are specified for user to understand what they are and set to default.

The script is parsed using the Parse() method to make sure there are no errors. The Parse() method also generates the high level T-SQL fragment object containing all the fragments in the script. This is what will be used by the GenerateScript() method to format the code. This code checks for errors and lets the caller know how many might be in the code.

Below are a few examples of usage of the Posh cmdlet.

Formatting a Table Script

In this example the script file createtable.sql has a  'CREATE TABLE' script, shown below, where the column 'name' is not aligned with the body of the script. The keyword 'smallint' and 'not null' are in lower case.  The output (displayed on console and saved to the specified output file) has these issues corrected.

CREATE TABLE [HumanResources].[Department](
       [DepartmentID] [smallint] IDENTITY(1,1) NOT NULL,
[Name] [dbo].[Name] not null,
       [GroupName] [dbo].[Name] NOT NULL,
       [ModifiedDate] [datetime] NOT NULL,
 CONSTRAINT [PK_Department_DepartmentID] PRIMARY KEY CLUSTERED ([DepartmentID] ASC
)WITH (PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

Here is the code to format the script.

Format-Tsql -InputScript c:\scriptdom\createtable.sql -OutputScript c:\scriptdom\formattedcreatetable.sql

Here are the results. Note the changes in the script.

Formatting a View

Below is an example of a view that also has columns misaligned and join clauses not indented for easy reading. Both issues are fixed in the output, which is visible on the screen and saved to the script file specified.

CREATE OR ALTER VIEW [Sales].[vSalesPerson] as
SELECT
s.[BusinessEntityID]
,p.[Title]
,p.[FirstName]
FROM [SALES].[SalesPerson] s
inner join [HUMANRESOURCES].[Employee] e
ON e.[BusinessEntityID] = s.[BusinessEntityID]
INNER JOIN [Person].[Person] p
ON p.[BusinessEntityID] = s.[BusinessEntityID]
GO 

The formatting code:

Format-Tsql -InputScript c:\scriptdom\createview.sql -OutputScript c:\scriptdom\formattedcreateview.sql

The output:

3 Stored Procedure

Below is an example of a stored procedure script that has the query inside misaligned, reserved word in parameter 'nvarchar' is in lower case.  Both issues are fixed in the output,visible on the screen and saved to script file specified.

CREATE OR ALTER PROCEDURE [Website].[SearchForPeople] 
@SearchText nvarchar(1000), 
@MaximumRowsToReturn int AS 
BEGIN SELECT TOP(@MaximumRowsToReturn) p.PersonID, p.FullName, p.PreferredName, CASE WHEN p.IsSalesperson <> 0 THEN N'Salesperson' WHEN p.IsEmployee <> 0 THEN N'Employee' WHEN c.CustomerID IS NOT NULL THEN N'Customer' WHEN sp.SupplierID IS NOT NULL THEN N'Supplier' WHEN sa.SupplierID IS NOT NULL THEN N'Supplier' END AS Relationship, COALESCE(c.CustomerName, sp.SupplierName, sa.SupplierName, N'WWI') AS Company FROM [Application].People AS p LEFT OUTER JOIN Sales.Customers AS c ON c.PrimaryContactPersonID = p.PersonID LEFT OUTER JOIN Purchasing.Suppliers AS sp ON sp.PrimaryContactPersonID = p.PersonID LEFT OUTER JOIN Purchasing.Suppliers AS sa ON sa.AlternateContactPersonID = p.PersonID WHERE p.SearchName LIKE N'' + @SearchText + N'' order by p.FullName FOR JSON AUTO, ROOT(N'People'); END; GO
Format-Tsql -InputScript c:\scriptdom\createtable.sql -OutputScript c:\scriptdom\formattedcreatetable.sql

Pros and Cons

As with any code, there are pros and cons to using ScriptDOM.

Pros

Any call to ScriptDOM, including what is being demo-ed with formatting scripts, can be made asynchronously with no active connection to SQL Server. There is no SQL Server install needed on the machine running the script. This makes it very lightweight to run on code repositories, or other large volumes of code.

Cons

The formatting option of ScriptDOM strips code of comments, which is why it should be used with caution on existing object scripts. It is also lacking in certain formatting options that would be useful, such as a choice between tab and spaces for indenting, the position of commas as column delimiters, and optional square brackets around object names. ScriptDOM is also very sparsely documented. Most people have to learn how to use it by trial and error or by referring to the few blog posts available out there.

References

1 There is a C# version of formatter, written by Michael Swart(t|b) for removing comments from code. It achieves the same purpose and  it can be found here.

2 A great introduction to ScriptDOM can be found on this talk by Arvind Shyamsunder(t|b) on the PM Team. Arvind’s GitHub Repo also has several C# scripts using ScriptDOM for various purposes including an older version of the formatter.

Below is the code using ScriptDOM to format scripts and generate an output (formatted) file.

<#  
.SYNOPSIS  
    Format-TSQL will format tsql script supplied per options set
.DESCRIPTION  
    This script will strip supplied code of comments format per options
.NOTES  
    Author     : Mala Mahadevan (malathi.mahadevan@gmail.com)
  
.PARAMETERS
-InputScript: text file containing T-SQL
    -OutputScript: name of text file to be generated as output
.LIMITATIONS
Strips code of comments
.LINK  
    

.HISTORY
2021.08.08First version for sqlservercentral.com
#>
function Format-TSQL
{
   
    #Defining parameter for scriptname
    [CmdletBinding()]
    param(
           [System.IO.FileInfo[]]$InputScript,
           [String]$OutputScript
    )
    If ((Test-Path $InputScript -PathType Leaf) -eq $false)
    {
        $errormessage = "File $InputScript not found!"
        throw $errormessage
    }
    If ((Test-Path $OutputScript -IsValid) -eq $false)
    {
        $errormessage = "Path for $Outputscript not found!"
        throw $errormessage
    }
    #This may need to be modified to wherever the dll resides on your machine
    Add-Type -Path "C:\Program Files\Microsoft SQL Server\150\DAC\bin\Microsoft.SqlServer.TransactSql.ScriptDom.dll"
    $generator = [Microsoft.SqlServer.TransactSql.ScriptDom.Sql150ScriptGenerator]::New();
    #Include semi colons at end of every statement
    $generator.Options.IncludeSemicolons = $true
    #Aligns body inside of  blocks
    $generator.Options.AlignClauseBodies = $true
    #Aligns all column definitions for create view/table
    $generator.Options.AlignColumnDefinitionFields = $true
    #Aligns set statements
    $generator.Options.AlignSetClauseItem = $true
    #create or alter 'as' will be on its own line
    $generator.Options.AsKeywordOnOwnLine = $true
    #Define indentation - only spaces
    $generator.Options.IndentationSize = 10
    #Indent set clauses
    $generator.Options.IndentSetClause = $true
    #Indent body of view
    $generator.Options.IndentViewBody = $true
    #Set keyword casing
    $generator.Options.KeywordCasing =  1 #0 lower case 1 upper case 2 pascal case
    #Seperate each column on insert source statement to its own line
    $generator.Options.MultilineInsertSourcesList = $true
    #Seperate each column on insert target statement to its own line
    $generator.Options.MultilineInsertTargetsList = $true
    #Seperate each column on select statement to its own line
    $generator.Options.MultilineSelectElementsList = $true
    #Separate each item on set clause to its own line
    $generator.Options.MultilineSetClauseItems = $true
    #Separate each column on view to its own line
    $generator.Options.MultilineViewColumnsList = $true
    #Separate each line on where predicate to its own line
    $generator.Options.MultilineWherePredicatesList = $true
    #Insert a new line before ( on multi line list of columns
    $generator.Options.NewLineBeforeCloseParenthesisInMultilineList = $true
    #Insert a new line before from clause
    $generator.Options.NewLineBeforeFromClause = $true
    #Insert a new line before group by clause
    $generator.Options.NewLineBeforeGroupByClause = $true
    #Insert a new line before having clause
    $generator.Options.NewLineBeforeHavingClause = $true
    #Insert a new line before join
    $generator.Options.NewLineBeforeJoinClause = $true
    #Insert a new line before offset clause
    $generator.Options.NewLineBeforeOffsetClause = $true
    #Insert a new line before ) on multi line list of columns
    $generator.Options.NewLineBeforeOpenParenthesisInMultilineList = $true
    #Insert a new line before order by
    $generator.Options.NewLineBeforeOrderByClause = $true
    #Insert a new line before output clause
    $generator.Options.NewLineBeforeOutputClause = $true
    #Insert a new line before where clause
    $generator.Options.NewLineBeforeWhereClause = $true
    #Recognize syntax specific to engine type - to be safe use 0
    $generator.Options.SqlEngineType = 0 # 0 All 1 Engine 2 Azure
    #Version used 
    #1002
    #1103
    #1204
    #1305
    #1406 
    #1507
    #80    1
    #90    0 (default)
    $generator.Options.SqlVersion = 7
    #Read the string passed in
    $stringreader = New-Object -TypeName System.IO.StreamReader -ArgumentList $InputScript
   
    #Create generate object
    $generate =     [Microsoft.SqlServer.TransactSql.ScriptDom.Sql150ScriptGenerator]($generator)
    #Parse the string for errors and create tsqlfragment for formatting
    $parser = [Microsoft.SqlServer.TransactSql.ScriptDom.TSql150Parser]($true)::New();
    
    if($parser -eq $null){
    throw 'ScriptDOM not installed or not accessible'
    }
    $parseerrors = $null
    $fragment = $parser.Parse($stringreader,([ref]$parseerrors))
    # raise an exception if any parsing errors occur
    if($parseerrors.Count -gt 0) {
        throw "$($parseErrors.Count) parsing error(s): $(($parseErrors | ConvertTo-Json))"
    } 
    $formattedoutput = ''
    #Format the string
    $generate.GenerateScript($fragment,([ref]$formattedoutput)) 
    write-host $formattedoutput -BackgroundColor blue
    $formattedoutput.ToString() | Out-File $OutputScript
        
}

Rate

4.4 (5)

You rated this post out of 5. Change rating

Share

Share

Rate

4.4 (5)

You rated this post out of 5. Change rating