SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Inserting Markup into a String with SQL

 There are a number of occasions when one might want to insert information at various places in a string. Where one has a template for reporting or emailing, for example; or maybe creating an HTML-formatted result.  Here is a method that I've used a fair amount, and which I'm refining for the next version of the prettifier that I'm developing.  It is based on the wonderful STUFF function, but allows any number of 'simultaneous' insertions into a string without the pain and slowness of multiple string concatenations. Let's take a slightly artificial example where you decide to render a string in several colours. We have the information as to the location of the start and end of each coloured part of the string.  It is all terribly easy. (please remember that this is just an example!)

DECLARE @MarkedUpString VARCHAR(255)
DECLARE @markup TABLE (start INT PRIMARY KEY, [end] INT, color VARCHAR(10))
INSERT INTO @markup (start,[end],color) SELECT  1, 5,'red'
INSERT INTO @markup (start,[end],color) SELECT  9, 10,'green'
INSERT INTO @markup (start,[end],color) SELECT  11, 17,'fuscia'
INSERT INTO @markup (start,[end],color) SELECT  36, 44,'blue'

SELECT @markedupString='This is a string which needs to be coloured'
SELECT @markedUpString=
STUFF(@MarkedUpString+'  ',[end],0,'</span>'),
[start],0,'<span style="color:'+color+';">'
FROM @markup ORDER BY start DESC
<span style="color:red;">This</span> is <span style="color:green;">a</span> <span style="color:fuscia;">string</span> which needs to be <span style="color:blue;">coloured</span>        

 so there you have it. Because there is no overlap, and we do the replacements from the end of the string to the beginning, it all works out, and we can use the initial references without them getting spoiled. The reason I've got so interested in this technique is because any string manipulation with a long string takes a big performance hit. This is, I think, due to the way that the NET framework handles strings. If you are handling big strings say 32K and 40K, or larger, it will pay you to use an approach which cuts down on unnecessary string manipulation, and you can thereby save a lot of time. My advice would be to feel free with small strings below 32K in SQL Server, but avoid too much processing on any string variable larger where possible.



Posted by Steve Jones on 13 February 2009

Very nice technique. I'd be curious to have someone run some performance analysis on a couple techniques and see which one works better.

Posted by Phil Factor on 13 February 2009

That was going to be a subsequent blog! There is something very interesting going on with the memory management for strings. It seems to affect XML performance too.

Posted by Phil Factor on 16 February 2009

OK: I've done some testing in a separate blog. It looks as if string concatenation is blindingly fast for small strings but that there is a second order logarithmic trend that means that it slows down progressively as the strings get longer.


Be Careful with String Concatenations in SQL Server with Big Strings. here in SSC blogs

Posted by jcrawf02 on 18 February 2009

Phil, so I realize this is an example, and I'm trying to figure out how I'd use it to my advantage in real life. And I'm slow.

Would I use patindex()/charindex() to populate the @markup table first? And then execute as above, which would require a different run through for every different tag?

Or do I somehow tie this in with a replacement table that will swap things in one shot, e.g. table cells for every tab character/HTML special characters when encountered, etc?

Posted by Phil Factor on 18 February 2009

Yes. but it wouldn't need a separate pass for every tag. You establish the 'transition points' in one table using patindex/Charindex, and use that table to insert the markup all in one SQL Statement. With the new version of the prettifier, I do one pass through the string to get all the transition ponts, update that table to work out  the extent of strings, comments, 'escaped' keywords and the like, and then use the resulting table to inset the markup all in one SQL Statement.

Posted by peter on 19 February 2009

Phil Factor,

You want to avoid the pseudo cursor here and take a different take on the problem. You already identified the points where you want to insert strings, from there on do this:

1. Build a seperate list of strings from the orginal string, using the already established positions withing the original string.

2. Use the "select .... for xml path( '' )" trick to make the final result very much like a zipper works fusing the two lists in the correct order.

This can be performed fully set based without no need for intermediate tables for the strongest effect.

Leave a Comment

Please register or log in to leave a comment.