Home Forums SQL Server 2012 SQL 2012 - General SQL help - how do we find that a particular string appears x times in a certain field RE: SQL help - how do we find that a particular string appears x times in a certain field

  • Jeff Moden (8/26/2016)


    Matt Miller (#4) (8/25/2016)


    I'm sorry if I am taking the fun out of building a SQL only solution, but there are lots of basic samples of SQLCLR showing how to expose the Regex.Matches() method from .NET CLR.

    A good evrsion is located here courtesy of Phil Factor:

    https://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/[/url]

    Hey there ol' friend... you and I went through something similar years ago where regex didn't do so well performance-wise. Have you tried the performance for this solution?

    Actually I just did. It's actually impressive, and having revalidated against the CLR solution, at least from my test - Alan's solution does beat CLR. But the interesting thing is that while running it on 2014, on large datasets (20M rows, 2-3k strings with avg 7 matches), on my machine - they are pretty darned close (total run took 8:30):

    Ngrams8k came in at 4:08, CLR @ 4:22.

    Which was about the same gap got on "small" datasets (100K rows, 250 char length, 3-4 matches). About 4 secs for Ngrams, 19 secs on CLR.

    So - NGRAMS8k clearly does seem to perform a bit better, just not the blowout I was half expecting to see. Still it was the first time I browsed through Alan's article on it, so will kick the tires a bit more.

    Thanks for the heads up.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?