• Jeff Moden (10/1/2012)


    Eugene Elutin (10/1/2012)


    A couple of us just went through all of this on another thread a couple of months ago and the LIKE expressions blew the doors off the RegEx CLR.

    Could anyone post a link, please.

    Took me a bit to remember that the discussion was on an article by David Poole because it was Solomon Rutzky's code that we were discussing. Here's the link...

    http://www.sqlservercentral.com/Forums/Topic1296195-60-1.aspx

    Thank you Jeff! I've read through thread and original article, and found what I've expected.

    This sort of using RegEx (or any other .NET lib) in CLR I call "CLR wrapper". And you are right: this implementation most likely will loose to T-SQL one in terms of performance. And there is good reson for this: It is because that kind of implementation is too generic and not optimised for performance.

    The main reason for it to be quite slow and to require higher memory consumption is that every time such CLR function is called, underlying object needs to be instantiated in .NET framework and regular expression pattern re-compiled!

    I don't usually do this (and not only with RegEx's) for the above particular reason.

    Now, if you would like to design process/function optimised for performance you may need to trade-off its "generic" nature.

    If you design your CLR function to work for the specific pattern (and it was couple discussions recently about how to remove every occurrence of GUID string in any text or strip some "patterns" from beginning and end of strings), you can create RegEx based CLR, which will be quite hard (if possible) to outperform with T-SQL.

    That is all to do with the way you code CLR with RegEx, there are two simple tips:

    1. Create static RegEx object with compiled pattern!

    2. Don't error handle in CLR

    There are another aspects to mention here:

    It will also depend on the string sizes: CLR will win over T-SQL even with bigger margin

    In case of replacement required, than more occurrences of string to replace, than again such CLR will win over T-SQL with bigger margin.

    But, as I've stated, you will need to have dedicated CLR for a pattern. Yes, it's not too flexible, but it gives you much, much better performance...

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]