The Helper Table Workbench

Cursors and iterations are both renowned for slowing down Transact SQL code, but sometimes seem unavoidable. In this workbench, Robyn Page and Phil Factor demonstrate some set-based techniques for string manipulation and time interval-based reporting, which use helper tables rather than the dreaded cursor.

Sometimes, when writing TSQL code in functions or procedures, it is tempting to do iterations, or even worse, a cursor, when it isn’t really necessary. Cursors and iterations are both renowned for slowing down Transact SQL code. SQL Server just isn’t designed for it.

However, there is usually a way to do such operations in a set-based way. If you do so, then your routines will run a lot faster, with speed at least doubling. There are a lot of tricks to turning a problem that seems to require an iterative approach into a set-based operation, and we wish we could claim we’d invented one of them. Probably the most useful technique involves that apparently useless entity, the ‘helper’ table. This workshop will concentrate on this, because it is probably the most widely used.

The most common Helper table you’ll see is a table with nothing but the numbers in a sequence from 1 upwards. These tables have a surprising number of uses. Once you’ve understood the principles behind helper tables, then you’ll think of many more. We’ll be providing several examples where a helper table suddenly makes life easier. The objective is to show the principles so that you’ll try out something similar the next time you have to tackle a tricky operation in TSQL. As always, you’re encouraged to load the example script into Query Analyser or Management Studio, and experiment!

Our examples include:

Splitting Strings into table-rows, based on a specified delimiter
Encoding and decoding a string
Substituting values into a string
Extracting individual words from a string into a table
Extracting all the numbers in a string into a table
Removing all text between delimiters
Scrabble score
Moving averages
Getting the ‘Week beginning’ date in a table
Calculating the number of working days between dates.

Note. These examples use VARCHAR(8000) just so they compile on both SQL Server 2000 and 2005. If you are using SQL Server 2005, you’ll probably want to change them to VARCHAR(MAX).

Before we start, we’ll need a helper table of numbers. Our examples aren’t going to require high numbers, but we’ve parameterised the size of the table that the routine creates

Creating the helper table

Here is a routine that checks to see if such a ‘helper’ table called numbers exists, and, if not, creates it.

Once you have one of these tables, which we’ve seen described as the Transact SQL developer’s ‘Swiss Army Knife’, you will not want to be without it.

Splitting Strings into table-rows, based on a specified delimiter

Imagine you have a string which you want to break into words to make into a table.

This can be done very simply and quickly through the following single SQL Select statement:

(Make sure you have executed the spMaybeBuildNumberTable procedure first!)

You can then enshrine this principle into a table function that will take any delimiter to split a string. (We believe that the credit for this clever routine should go to Anith Sen).

This is the fastest means I have come across in TSQL to split a string into its components as rows. Try this, which give you a table with the integer and the nominal name.

Encoding and decoding a string

You can use the same principle for a lot of string operations.

Here is one that will URLencode a string so it can be used in a POST or GET HTTP operation. The string is converted into a table with one character per row and, after being operated on, it is re-assembled.

This sort of routine is a lot less complex than the iterative methods and is, as one would expect, a lot faster. Just to show that this is no fluke, here is the reverse function to decode a URL Query string.

Here is a simple function that acts as an int to hex converter.

It works this way:

And what about TSQL that converts hex to int?

And now we can test it out:

Substituting values into a string

Next, we have a function that uses these principles to do macro substitution of values into a string. It will work for replacing XHTML placeholders with a value, or producing error messages in a variety of languages. In fact, uses keep popping up for this sort of function. In this version, one can specify what strings are used for substitutions (the default is %1, %2, %3 etc,) and what you use as the delimiter of your list of macros and values.

There are several ways that we can use this routine in practical applications. Try out these and see what happens!

Extracting individual words from a string into a table

One can do rather cleverer things than this. For example, one can extract all the words from a string into a table, a row for each
word.

Extracting all the numbers in a string into a table

Even more useful than this is a function that picks out all the numbers from a string into a table. You can therefore easily pick out the third or fourth string simply, because the table has the order as well as the number itself. Were it not for the unary minus operator, this would have been a delightfully simple function.

If you are using this routine, you’ll want to cast these numbers into the number type of your choice. We supply them as strings.

Removing all text between delimiters

This is a handy little routine for looking at the strings in HTML code, but seems to earn its keep in a lot of other ways. You specify the opening and closing delimiter. At the moment, only single-character delimiters are allowed. Can anyone re-write it to allow multi-character delimiters?.

So we can try it out with brackets:

Or if you want to take out tags:

Scrabble Score

And as a slightly silly example of the sort of chore that crops up occasionally when analysing strings, character by character, here is a way of scoring a string for Scrabble, assuming it is all on ordinary squares!

And now we try it out!

Now we find out which are the highest scorers, assuming an illegal quantity of tiles. We use the WordList from Phil’s Blog on the Fireside Fun of Decapitations.

Don’t try running this without the WordList!

As well as slicing and dicing strings, once one has one’s helper table,
suddenly time-interval based reporting becomes much easier.

Moving averages

How about moving averages? Here is a simple select statement that gives you the moving average (over a week) of the number of log entries for every day for a year. This can be adapted to give weighted and exponential moving averages over arbitrary time periods. You use this technique for ironing out ‘noise’ from a graph in order to accentuate the underlying trend.

To execute this, you will need a table to try it on.

Now we can try out a moving average!

Getting the ‘Week beginning’ date in a table

Here is a UDF that lists all the Mondays (or whatever you want) between two dates.

Number of Working Days between two dates

Or, how about a UDF that tells you the number of working days between two dates? (you can alter it if Saturday and Sunday are not your days off!)

So how many working days until Christmas?

We can go on for ever with example of using a numeric Helper table but we won’t because most of you will have wandered off even before you reach this point. We hope that you’ll now take over and create some more examples, try them out against iterative solutions that do the same thing. We guarantee you’ll be pleased with the result!

See also other Workbenches at Simple-Talk

Robyn Page’s SQL Server DATE/TIME Workbench, Robyn Page
Date calculation and formatting in SQL Server can be surprisingly tricky. Robyn Page’s “hands-on” workbench will lead you through the minefield.

Robyn Page’s SQL Server String Manipulation Workbench, Robyn Page
String searching and manipulation in SQL Server can be error-prone and tedious… Unless you’re armed with the techniques described in Robyn’s string manipulation workbench…

SQL Server Error Handling Workbench, Grant Fritchey
Grant Fritchey steps into the workbench arena, with an example-fuelled examination of catching and gracefully handling errors in SQL 2000 and 2005, including worked examples of the new TRY..CATCH capabilities.

Robyn Page’s SQL Server Cursor Workbench, Robyn Page
The topic of cursors is the ultimate “hot potato” in the world of SQL Server. Everyone has a view on when they should and mainly should not be used. By example and testing Robyn Page proves that, when handled with care, cursors are not necessarily a “bad thing”.

Robyn Page’s SQL Server Data Validation Workbench, Robyn Page
Robyn Page provides essential techniques for ensuring the validity of the data being entered into your SQL Server tables.

Robyn Page’s Excel Workbench, Robyn Page and Phil Factor
The need to produce Excel reports from SQL Server is very common. Here, Robyn Page and Phil Factor present practical techniques for creating and manipulating Excel spreadsheets from SQL Server, using linked servers and T-SQL. The pièce de résistance is a stored procedure that uses OLE Automation…

Robyn Page’s SQL Server Security Workbench, Robyn Page and Phil Factor
Robyn Page and Phil Factor present practical T-SQL techniques for controlling access to sensitive information within the database, and preventing malicious SQL injection attacks.