SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


RegEx for DBAs


RegEx for DBAs

Author
Message
David.Poole
David.Poole
SSCertifiable
SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)

Group: General Forum Members
Points: 7865 Visits: 3290
Comments posted to this topic are about the item RegEx for DBAs

LinkedIn Profile

Newbie on www.simple-talk.com
rraja
rraja
SSC-Enthusiastic
SSC-Enthusiastic (126 reputation)SSC-Enthusiastic (126 reputation)SSC-Enthusiastic (126 reputation)SSC-Enthusiastic (126 reputation)SSC-Enthusiastic (126 reputation)SSC-Enthusiastic (126 reputation)SSC-Enthusiastic (126 reputation)SSC-Enthusiastic (126 reputation)

Group: General Forum Members
Points: 126 Visits: 122
Very nice Article.:-)
david.rowland
david.rowland
Valued Member
Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)Valued Member (66 reputation)

Group: General Forum Members
Points: 66 Visits: 101
Although RegEx is powerful, I have tried using it once and found that coding my own CLR function to manipulate strings to what I want was 100 times faster than the RegEx function was.

If you are running the code against a small dataset this should not matter however I was running the code against millions of rows and this had a major impact on performance.
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)

Group: General Forum Members
Points: 88492 Visits: 41130
Great and very nicely put together article, David. My hat is off to you. But what of performance? Have you done any comparisons between the Regex function and a moderately complex PatIndex?

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)SSC Guru (88K reputation)

Group: General Forum Members
Points: 88492 Visits: 41130
david.rowland (5/7/2012)
Although RegEx is powerful, I have tried using it once and found that coding my own CLR function to manipulate strings to what I want was 100 times faster than the RegEx function was.

If you are running the code against a small dataset this should not matter however I was running the code against millions of rows and this had a major impact on performance.


What did your custom CLR do to beat Regex so soundly?

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
corey lawson
corey lawson
SSC-Addicted
SSC-Addicted (438 reputation)SSC-Addicted (438 reputation)SSC-Addicted (438 reputation)SSC-Addicted (438 reputation)SSC-Addicted (438 reputation)SSC-Addicted (438 reputation)SSC-Addicted (438 reputation)SSC-Addicted (438 reputation)

Group: General Forum Members
Points: 438 Visits: 564
I've used our own in-house implementations of some of the .Net RexEx functions on seemingly large datasets (several million records each), mostly to help clean up, validate and parse addresses, city-state-zip (or province), etc.

Beats pure SQL functions like Hulk did to Loki in the movie, especially in understandability. I'll take teasing a complicated Rexex pattern over scads of nested SQL functions trying to do the same thing.

one I really have done lots of is to do is keep expanding on a pattern like this: '(Company|corp|corporation|inc|llp|llc...)'

beats continuing to add "or field like '%new_match%' ..."
borisk 57245
borisk 57245
Grasshopper
Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)Grasshopper (18 reputation)

Group: General Forum Members
Points: 18 Visits: 39
RegEx is Thee best tool in order to find any pattern within given string[did couple of test including .Net String manipulation and limited RegEx in SQL , Like/PATIndex....], and it proves to be much faster than any other way.
in regards to good editor, try this one: http://www.ultrapico.com/Expresso.htm

moreover, consider using FullText indexing from time to time, as it might even ease searching for Words other than using RegEx.
Jeffrey Bijkerk
Jeffrey Bijkerk
Grasshopper
Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)Grasshopper (10 reputation)

Group: General Forum Members
Points: 10 Visits: 34
If you use regex combined with a where clause it can not use the index anymore. I think this is very important.
williams.frd
williams.frd
Forum Newbie
Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)Forum Newbie (7 reputation)

Group: General Forum Members
Points: 7 Visits: 13
Am I out of line in thinking this is something that should be handled in a high-level object-oriented language?
paul.knibbs
paul.knibbs
SSCrazy
SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)SSCrazy (2.6K reputation)

Group: General Forum Members
Points: 2607 Visits: 6232
I'm not particularly familiar with the CLR--does processing stuff via RegEx in this way effectively mean you're using RBAR, or is it a bit cleverer than that?
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search