RE: Remove duplicate substrings from string

SSC Guru

Points: 182882

April 24, 2018 at 10:23 am

#1988072

Thom A - Tuesday, April 24, 2018 9:03 AM
Interesting idea, Eirikur. If I read your code correctly, however, this does assume that the value between the Greater Than symbols is only 1 character.

You are correct, Thom; the sample data only indicates one space character between the GTs and the values.
😎

If the sample data does not accurately reflect the characteristics of the real data, then this can easily be adjusted. I'm not in the business of assuming or guessing 😉

In this case, the two major performance factors are the iterations in the delimiter detection part and the iterations in the reconstruction of the output string. Since the sample data indicates fixed-width entries, we can cut the number of iterations in the former by a factor equal to the length of each entry; here that leaves 1/4 of the iterations.
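To illustrate the counting argument only (this is not the T-SQL from the thread), here is a minimal Python sketch. It assumes single-character values separated by " > ", so each entry is 4 characters wide; the sample string and the function names are made up for the example.

```python
def dedupe_scan(s, delim=">"):
    """Detect delimiters by inspecting every character position."""
    seen, out, start, checks = set(), [], 0, 0
    for i, ch in enumerate(s):
        checks += 1                      # one check per character
        if ch == delim:
            value = s[start:i].strip()
            if value not in seen:
                seen.add(value)
                out.append(value)
            start = i + 1
    value = s[start:].strip()            # trailing value after the last delimiter
    if value not in seen:
        out.append(value)
    return " > ".join(out), checks


def dedupe_fixed(s, value_len=1, sep=" > "):
    """Assume fixed-width entries and jump straight from entry to entry."""
    width = value_len + len(sep)         # 4 for the assumed sample layout
    seen, out, checks = set(), [], 0
    for pos in range(0, len(s), width):
        checks += 1                      # one check per entry
        value = s[pos:pos + value_len]
        if value not in seen:
            seen.add(value)
            out.append(value)
    return sep.join(out), checks


sample = "A > B > A > C > B"             # hypothetical sample data
scan_result, scan_checks = dedupe_scan(sample)
fixed_result, fixed_checks = dedupe_fixed(sample)
print(scan_result, scan_checks)
print(fixed_result, fixed_checks)
```

Both functions return the same de-duplicated string, but the fixed-width version touches one position per entry rather than one per character, which is where the roughly 1/4 comes from. If the real values vary in length, the fixed-width shortcut no longer applies.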

Other factors come from the plan simplification: the reduction of blocking (sort) operators from 7 to 1, which makes a lot of difference on large sets; the number of joins from 11 to 9; and the reduction of the total number of operators from approx. 67 to 48.

Although the batch execution figure in the execution plan is not always accurate, the difference here is 9:1 on my old laptop.