Beware the REPLACE function when doing heavy string parsing


I found something interesting the other day. I was attempting to optimize a very heavy string-parsing routine in T-SQL and was having problems. I don't normally use T-SQL for such heavy string parsing, but this was a special case involving a legacy structure I had to work with, so I had no choice. Anyway, the routine in question was a table function that basically takes a fixed-length text file and turns it into a table (please don't ask why) so you can join it to other tables in a database. The thing that made this interesting is that the fixed-length text file had over 4,000 rows, so to turn it into a table the function has to slice and dice those 4,000 rows every time it's executed, which it does with a WHILE loop.

When I finished this function, it ran in about 15 seconds, which was way too slow for our needs. Unbeknownst to me, one of my colleagues was attempting the same thing and wrote his code in a slightly different way, but he got it to run in about 1 second. When I analyzed his code, I found it to be virtually identical to mine, with one difference: during every iteration I was using the REPLACE function to whittle down the string until none of it was left, at which point the WHILE loop would exit and the function would finish executing. In contrast, his approach did not use REPLACE at all; he simply navigated forward in the string until he reached the end.
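To make the difference concrete, here is a stripped-down sketch of the pattern my version was using. The real function isn't reproduced here, so the variable names are invented and the line handling is simplified; it also assumes every line, including the last, ends in a CR+LF pair.

DECLARE @FileText VARCHAR(MAX), @Line VARCHAR(500);

-- A tiny stand-in for the 4,000-row fixed-length file
SET @FileText = 'ROW1FIELDAFIELDB' + CHAR(13) + CHAR(10)
              + 'ROW2FIELDAFIELDB' + CHAR(13) + CHAR(10);

WHILE LEN(@FileText) > 0
BEGIN
    -- Grab everything up to the first line break
    SET @Line = LEFT(@FileText, CHARINDEX(CHAR(13) + CHAR(10), @FileText) - 1);

    -- ... parse @Line and insert it into the result table ...

    -- The expensive part: REPLACE rescans and rebuilds the entire remaining
    -- string on every iteration just to remove the line that was parsed
    -- (and it would also remove any later duplicate of that line).
    SET @FileText = REPLACE(@FileText, @Line + CHAR(13) + CHAR(10), '');
END;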

At this point one of my very special mental "talents" kicked in: when my mind is stuck on a problem and someone ELSE tells me the solution, my mind immediately says "oh, I knew that!" and proceeds to give me a detailed explanation of the solution and why it works. It leaves you wondering where all that great information was hiding 5 seconds ago, when you didn't have the answer and really needed it... you have to wonder why such useless "talents" even exist....

So, of course, once I saw this everything became obvious: my version of the function was doing over 4,000 REPLACE operations and his wasn't. On top of that, I have to wonder whether the string data types, such as the VARCHAR type we were both using, are implemented as immutable data types in SQL Server. Does anyone know if this is the case? If it is, it would further explain the difference in speed. Since an immutable data type doesn't really 'change' the instance of the string in memory but instead creates a new instance, my function would have been creating over 4,000 in-memory copies of the large string I was parsing. Talk about a waste!

I mention this because lots of SQL developers may not have much experience in string parsing, since this is normally not done using SQL.  This may be old hat to many application programmers, but it’s not something that DBAs run into as frequently, so I’m hoping that others will find it as useful as I did.

So, the moral of the story is: be careful when using REPLACE in heavy string-parsing operations. Like many other things, if you need it you should use it, but the trick is in knowing when you REALLY need it. In my case I thought I needed it, but in reality there was a much simpler solution available, and it ended up running 15 times faster. It proves the old adage of "Keep It Simple, Stupid": given two equivalent solutions, the simpler one is better.

…and that’s my $.02 for today….

SB

Comments

Posted by MitchellT on 27 August 2009

Great article, but the suspense is killing me - what did the other developer do to get around using REPLACE? Not sure if you're trying to avoid giving code samples in an article/editorial, but that's kind of the whole reason I'm here: to learn a better way to do things with side-by-side examples.

Thanks for any examples!

Posted by vg on 27 August 2009

teixeim, he said "he was simply navigating forward in the string until he reached the end", so my guess is he was probably using an int to capture where he was in the string and taking substrings inside the WHILE loop. That way he would only be using a single large string that he searched through, instead of creating a new string with the old data "REPLACED" each time.

Sam, thanks for the tip. So, you don't know for sure if VARCHAR is immutable? It sure sounds like it, but I'd like to know for sure as well. Can any MSFT people out there confirm?

Posted by Ray Hastie on 27 August 2009

teixeim - check out the SUBSTRING function. If you have a fixed line definition for the fields, you know that field A starts at position X and is Y characters long. The rest of the code depends on the circumstances.
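For example, if a (made-up) layout put an account number in positions 1-10, an amount in positions 11-22 and a status flag in positions 23-29, SUBSTRING pulls the fields straight out of each line:

DECLARE @Line VARCHAR(500);
SET @Line = 'ACCT001234      125.99ACTIVE ';

SELECT
    RTRIM(SUBSTRING(@Line,  1, 10)) AS AccountNumber,  -- positions 1-10
    RTRIM(SUBSTRING(@Line, 11, 12)) AS Amount,         -- positions 11-22
    RTRIM(SUBSTRING(@Line, 23,  7)) AS StatusFlag;     -- positions 23-29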

Posted by Jeff Moden on 27 August 2009

If you'd be willing to share the file and some information about how it's to be parsed, I'd be very happy to show you how to do it WITHOUT a While loop and then you can do another comparison...
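For anyone curious what "WITHOUT a While loop" tends to look like, a common set-based alternative is a tally (numbers) table splitter along these lines. The names here are invented, and it assumes SQL Server 2008 or later (for the VALUES row constructor) plus CR+LF line endings:

DECLARE @FileText VARCHAR(MAX);
SET @FileText = 'LINE ONE  ' + CHAR(13) + CHAR(10)
              + 'LINE TWO  ' + CHAR(13) + CHAR(10)
              + 'LINE THREE';

WITH E1(N) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS t(N)),
     E2(N) AS (SELECT 1 FROM E1 a CROSS JOIN E1 b),   --    100 rows
     E4(N) AS (SELECT 1 FROM E2 a CROSS JOIN E2 b),   -- 10,000 rows
     E8(N) AS (SELECT 1 FROM E4 a CROSS JOIN E4 b),   -- plenty of rows; TOP below caps it
     Tally(N) AS (SELECT TOP (DATALENGTH(@FileText))
                         ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
                  FROM E8)
SELECT LineText = SUBSTRING(@FileText, d.N + 2,
                    ISNULL(NULLIF(CHARINDEX(CHAR(13) + CHAR(10), @FileText, d.N + 2), 0),
                           DATALENGTH(@FileText) + 1) - (d.N + 2))
FROM (SELECT -1 AS N                      -- a virtual delimiter before the first line
      UNION ALL
      SELECT N FROM Tally
      WHERE SUBSTRING(@FileText, N, 2) = CHAR(13) + CHAR(10)) AS d
ORDER BY d.N;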

Posted by David Lean on 27 August 2009

I speculate that a SQLCLR function may be faster still. If you are looking for a separator character, e.g. ",", then the .NET string.Split method is extremely fast. If it's more complex, look at the Regex class; it is often able to separate an entire row of tens of fields in one call. Use something like Expresso 3.0 to help you learn how to write powerful regex expressions.
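For reference, once the .NET code is compiled into an assembly, the T-SQL side of wiring up such a CLR splitter looks roughly like this. The assembly, class and method names below are purely hypothetical, and CLR integration has to be enabled first (EXEC sp_configure 'clr enabled', 1; RECONFIGURE;):

-- Register the compiled assembly (name and path are hypothetical)
CREATE ASSEMBLY StringUtilities
FROM 'C:\Assemblies\StringUtilities.dll'
WITH PERMISSION_SET = SAFE;
GO

-- Expose the CLR table-valued splitter to T-SQL
CREATE FUNCTION dbo.clr_SplitString (@Input NVARCHAR(MAX), @Delimiter NCHAR(1))
RETURNS TABLE (ItemNumber INT, Item NVARCHAR(4000))
AS EXTERNAL NAME StringUtilities.[StringUtilities.Splitter].SplitString;
GO

-- Then use it like any other table function
SELECT ItemNumber, Item
FROM dbo.clr_SplitString(N'a,b,c', N',');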

Posted by YSLGuru on 27 August 2009

Sam,

Not sure if this has any bearing on your experience, but one interesting note about the T-SQL REPLACE() function that is not documented in Books Online, and that spawned an interesting debate when I posted about it on the SQLServerCentral.com forums, is that the function will always return a value (CHAR or VARCHAR, depending on what you pass to it) that is equal in length to the maximum length allowed for that data type, no matter how many characters are in the variable you pass to it.

For example, if you pass in the text 'ThisIsMyString' as a VARCHAR and tell it to replace 'My' with 'Your', the value returned by the function isn't just 'ThisIsYourString'; instead it returns 'ThisIsYourString' followed by as many space characters as are needed to pad the value up to the maximum length allowed for the data type.

So if you are passing in a string that is only 256 characters long, you are getting back, each time, a much bigger chunk of text. What makes this hard to notice is that Management Studio trims away those excess spaces for you, so you don't realize that what you see is not what was returned by REPLACE().

I first reported this as a bug, since BOL leads you to believe the value the function returns is equal in length to what is passed in. When that was rejected, I tried to at least get Microsoft to change BOL and make explicit note of this difference (the other string functions do not work like this), but that was also struck down.
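A quick way to test this on your own server is to wrap the result in DATALENGTH(), which reports exactly how many bytes came back regardless of how Management Studio displays the value:

DECLARE @fixed    CHAR(256),
        @variable VARCHAR(256);

SET @fixed    = 'ThisIsMyString';   -- CHAR is fixed length, so this value is space-padded to 256 characters
SET @variable = 'ThisIsMyString';

SELECT DATALENGTH(REPLACE(@fixed,    'My', 'Your')) AS BytesBackFromCharInput,
       DATALENGTH(REPLACE(@variable, 'My', 'Your')) AS BytesBackFromVarcharInput;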

Posted by ozkary on 27 August 2009

Another approach would be to convert the string into XML, which is faster than an alternative like a recursive CTE.
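Roughly like this: swap the CR+LF delimiters for XML element boundaries in a single pass and let the nodes() method do the splitting. The sketch below assumes the data contains nothing that needs XML escaping (no &, < or > characters); note that it still uses REPLACE, but only once over the whole string rather than once per line:

DECLARE @FileText VARCHAR(MAX), @x XML;

SET @FileText = 'LINE ONE' + CHAR(13) + CHAR(10)
              + 'LINE TWO' + CHAR(13) + CHAR(10)
              + 'LINE THREE';

-- Wrap every line in <r>...</r> elements by converting the delimiters
SET @x = CAST('<r>' + REPLACE(@FileText, CHAR(13) + CHAR(10), '</r><r>') + '</r>' AS XML);

SELECT LineText = r.value('.', 'VARCHAR(500)')
FROM @x.nodes('/r') AS t(r);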

Posted by 12String on 27 August 2009

As an experienced C, C++, and C# developer, I can understand the original mistake SB made in the design. In addition, Microsoft doesn't help by omitting important information from BOL or failing to update it, which leads me to the mantra I usually follow with BOL: Read, Verify, Test, and Improve through Code Refactoring!

Thanks for the insight!

Posted by Sam Bendayan on 27 August 2009

teixeim: I was parsing the string line by line and using REPLACE to get rid of the line I had just parsed. The other (smarter) developer would just go to the end of the line and then jump 2 characters (to get past the CRLF) so he would end up at the beginning of the next line.
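Since a few people asked for code, here is a minimal sketch of that faster forward-navigation loop. The variable names are invented and it assumes CR+LF line endings:

DECLARE @FileText VARCHAR(MAX), @Line VARCHAR(500);
DECLARE @Pos INT, @NextBreak INT, @TextLen INT;

SET @FileText = 'ROW1FIELDAFIELDB' + CHAR(13) + CHAR(10)
              + 'ROW2FIELDAFIELDB';
SET @Pos = 1;
SET @TextLen = DATALENGTH(@FileText);

WHILE @Pos <= @TextLen
BEGIN
    -- Find the end of the current line (or the end of the string for the last line)
    SET @NextBreak = CHARINDEX(CHAR(13) + CHAR(10), @FileText, @Pos);
    IF @NextBreak = 0 SET @NextBreak = @TextLen + 1;

    SET @Line = SUBSTRING(@FileText, @Pos, @NextBreak - @Pos);

    -- ... parse @Line and insert it into the result table ...

    -- Jump 2 characters past the CR+LF to the start of the next line;
    -- the big string itself is never rebuilt, which is the whole point.
    SET @Pos = @NextBreak + 2;
END;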

SB

Posted by sjsubscribe on 27 August 2009

By selectively refactoring most string functions to CLR, I was able to reduce a job that took 40 minutes down to a few seconds. Replace may not be the only culprit, but certainly an important one. The biggest speed boost I got was by using split in CLR.

Posted by Jon Russell on 31 August 2009

Good article. Thanks, Sam.
