Jeff Moden (8/26/2016)
Matt Miller (#4) (8/25/2016)
I'm sorry if I am taking the fun out of building a SQL only solution, but there are lots of basic samples of SQLCLR showing how to expose the Regex.Matches() method from .NET CLR.A good evrsion is located here courtesy of Phil Factor:
https://www.simple-talk.com/sql/t-sql-programming/clr-assembly-regex-functions-for-sql-server-by-example/[/url]
Hey there ol' friend... you and I went through something similar years ago where regex didn't do so well performance-wise. Have you tried the performance for this solution?
Actually I just did. It's actually impressive, and having revalidated against the CLR solution, at least from my test - Alan's solution does beat CLR. But the interesting thing is that while running it on 2014, on large datasets (20M rows, 2-3k strings with avg 7 matches), on my machine - they are pretty darned close (total run took 8:30):
Ngrams8k came in at 4:08, CLR @ 4:22.
Which was about the same gap got on "small" datasets (100K rows, 250 char length, 3-4 matches). About 4 secs for Ngrams, 19 secs on CLR.
So - NGRAMS8k clearly does seem to perform a bit better, just not the blowout I was half expecting to see. Still it was the first time I browsed through Alan's article on it, so will kick the tires a bit more.
Thanks for the heads up.
----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?