Blog Post

Parsing T-SQL – The easy way

,

UPDATE 2012-09-12 : For my latest adventures with TSQL Parsers please see this post

Every once in a while, I hit an issue that would require me to interrogate/parse some T-SQL code.  Normally, I would shy away from this and attempt to solve the problem in some other way.  I have written parsers before in the the past using LEX and YACC, and as much fun and awesomeness that path is,  I couldnt justify the time it would take.

However, this week I have been faced with just such an issue and at the back of my mind I can remember reading through the SQLServer 2012 feature pack and seeing something called “Microsoft SQL Server 2012 Transact-SQL Language Service “.  This is described there as :

“The SQL Server Transact-SQL Language Service is a component based on the .NET Framework which provides parsing validation and IntelliSense services for Transact-SQL for SQL Server 2012, SQL Server 2008 R2, and SQL Server 2008. “

Sounds just what I was after.  Documentation is very scant on this so dont take what follows as best practice or best use, just a practice and a use.

Knowing what I was sort of looking for something, I found the relevant assembly in the gac which is the simply named ,’Microsoft.SqlServer.Management.SqlParser’.

Even knowing that you wont find much in terms of documentation if you do a web-search, but you will find the MSDN documentation that list the members and methods etc…

The “scanner”  class sounded the most appropriate for my needs as that is described as “Scans Transact-SQL searching for individual units of code or tokens.”.

After a bit of poking, around the code i ended up with was something like

[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Management.SqlParser") | Out-Null
$ParseOptions = New-Object Microsoft.SqlServer.Management.SqlParser.Parser.ParseOptions
$ParseOptions.BatchSeparator = 'GO'
$Parser = new-object Microsoft.SqlServer.Management.SqlParser.Parser.Scanner($ParseOptions)
$Sql = "Create Procedure MyProc as Select top(10) * from dbo.Table"
$Parser.SetSource($Sql,0)
$Token=[Microsoft.SqlServer.Management.SqlParser.Parser.Tokens]::TOKEN_SET
$Start =0
$End = 0
$State =0 
$IsEndOfBatch = $false
$IsMatched = $false
$IsExecAutoParamHelp = $false
while(($Token = $Parser.GetNext([ref]$State ,[ref]$Start, [ref]$End, [ref]$IsMatched, [ref]$IsExecAutoParamHelp ))-ne [Microsoft.SqlServer.Management.SqlParser.Parser.Tokens]::EOF) {
    try{
        ($TokenPrs =[Microsoft.SqlServer.Management.SqlParser.Parser.Tokens]$Token) | Out-Null
        $TokenPrs
        $Sql.Substring($Start,($end-$Start)+1)
    }catch{
        $TokenPrs = $null
    }    
}

As you can see , the $Sql variable holds the sql to be parsed , that is pushed into the $Parser object using SetSource,  and then we will use GetNext until the EOF token is returned.  GetNext will also return the Start and End character positions within the source string of the parsed text.

This script’s output is :

TOKEN_CREATE

Create

TOKEN_PROCEDURE

Procedure

TOKEN_ID

MyProc

TOKEN_AS

as

TOKEN_SELECT

Select

TOKEN_TOP

top

TOKEN_INTEGER

10

TOKEN_FROM

from

TOKEN_ID

dbo

TOKEN_TABLE

Table

note that the ‘(‘, ‘)’  and ‘*’ characters have returned a token type that is not present in the Microsoft.SqlServer.Management.SqlParser.Parser.Tokens Enum that has caused an error which has been caught in the catch block. 

Fun, Fun ,Fun , Simple T-SQL Parsing.  Hope this helps someone in the same position,  let me know how you get on.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating