SQLServerCentral is supported by Redgate
 
RegEx for DBAs


Dave Poole
Comments posted to this topic are about the item RegEx for DBAs

LinkedIn Profile
www.simple-talk.com
rraja
Very nice article. :-)
david.rowland
Although RegEx is powerful, I tried it once and found that a custom CLR function written to manipulate the strings the way I wanted was 100 times faster than the RegEx function.

If you are running the code against a small dataset, this should not matter. However, I was running the code against millions of rows, and this had a major impact on performance.
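
For anyone who wants to check a claim like this on their own data, here is a minimal timing sketch. It is not the CLR function described above (that code was not posted); it is written in Python purely for illustration, and the task (keeping only the digits), the sample strings, and the names `strip_with_regex` and `strip_by_hand` are all assumptions. Timings in .NET or inside SQL Server may look quite different.

```python
import re
import timeit

# Hypothetical task: keep only the ASCII digits in each string.
NON_DIGITS = re.compile(r"[^0-9]+")

def strip_with_regex(value: str) -> str:
    """Remove every non-digit using a precompiled regular expression."""
    return NON_DIGITS.sub("", value)

def strip_by_hand(value: str) -> str:
    """Remove every non-digit with a plain character scan, no regex."""
    return "".join(ch for ch in value if "0" <= ch <= "9")

# Made-up sample data, repeated to give the timer something to measure.
SAMPLE = ["(555) 010-9999", "ext. 42", "no digits here"] * 1000

def run(func):
    """Apply one of the two functions to every sample string."""
    return [func(value) for value in SAMPLE]

if __name__ == "__main__":
    # Both approaches must agree before their timings mean anything.
    assert run(strip_with_regex) == run(strip_by_hand)
    for func in (strip_with_regex, strip_by_hand):
        seconds = timeit.timeit(lambda: run(func), number=5)
        print(f"{func.__name__}: {seconds:.3f}s")
```

Which approach wins depends on the pattern, the data, and the runtime, so it is worth measuring rather than assuming.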
Jeff Moden
Great and very nicely put together article, David. My hat is off to you. But what of performance? Have you done any comparisons between the Regex function and a moderately complex PatIndex?

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think it's expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Jeff Moden
david.rowland (5/7/2012)
Although RegEx is powerful, I tried it once and found that a custom CLR function written to manipulate the strings the way I wanted was 100 times faster than the RegEx function.

If you are running the code against a small dataset, this should not matter. However, I was running the code against millions of rows, and this had a major impact on performance.


What did your custom CLR do to beat Regex so soundly?

corey lawson
I've used our own in-house implementations of some of the .NET RegEx functions on seemingly large datasets (several million records each), mostly to help clean up, validate, and parse addresses, city-state-zip (or province), etc.

It beats pure SQL functions like the Hulk beat Loki in the movie, especially for understandability. I'll take teasing out a complicated RegEx pattern over scads of nested SQL functions trying to do the same thing.

One thing I have done a lot is to keep expanding a pattern like this: '(Company|corp|corporation|inc|llp|llc...)'

That beats continuing to add "or field like '%new_match%' ..."
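
A rough sketch of that idea, in Python rather than the .NET functions mentioned above (those were not posted). The suffix list is a hypothetical stand-in for the elided pattern, and the names `SUFFIXES`, `COMPANY`, and `looks_like_company` are made up for illustration. The `\b` word boundaries are an addition to the pattern as quoted, so that "inc" does not match inside a word such as "Lincoln".

```python
import re

# Hypothetical suffix list; extend it instead of adding more LIKE clauses.
SUFFIXES = ["company", "corp", "corporation", "inc", "llp", "llc"]

# One alternation built from the list. re.escape guards against suffixes
# that contain regex metacharacters; \b keeps matches on whole words.
COMPANY = re.compile(
    r"\b(" + "|".join(re.escape(s) for s in SUFFIXES) + r")\b",
    re.IGNORECASE,
)

def looks_like_company(name: str) -> bool:
    """Return True when the name contains one of the known suffixes."""
    return COMPANY.search(name) is not None

if __name__ == "__main__":
    for name in ["Acme Corp", "Widgets, Inc.", "Lincoln Park", "John Smith"]:
        print(name, looks_like_company(name))
```

Adding a new match then means appending one entry to the list; the pattern itself never has to be edited by hand.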
borisk 57245
RegEx is the best tool for finding any pattern within a given string (I did a couple of tests, including .NET string manipulation and the limited RegEx in SQL: LIKE/PATINDEX...), and it proved to be much faster than any other way.
As for a good editor, try this one: http://www.ultrapico.com/Expresso.htm

Moreover, consider using full-text indexing from time to time, as it might make searching for words easier than using RegEx.
Jeffrey Bijkerk
If you use a regex in a WHERE clause, the query can no longer use the index. I think this is very important.
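
The effect can be seen on a small scale with the sketch below. It uses Python's built-in SQLite module as a stand-in, not SQL Server or the CLR function from the article, so treat it only as an illustration of the general principle; the table, index, and function names are all made up. SQLite has no regex support of its own, so the `REGEXP` operator is backed here by a user-defined function, which is roughly the role a CLR function plays in SQL Server.

```python
import re
import sqlite3

def regexp(pattern, value):
    """Back SQLite's REGEXP operator: `value REGEXP pattern` calls regexp(pattern, value)."""
    return int(value is not None and re.search(pattern, value) is not None)

conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE customer (name TEXT)")
conn.execute("CREATE INDEX ix_customer_name ON customer (name)")
conn.executemany(
    "INSERT INTO customer VALUES (?)", [("Acme Corp",), ("Bob Smith",)]
)

def plan(sql, params):
    """Return the planner's description of how it intends to read the table."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

# An equality predicate lets the planner seek directly into the index.
equality_plan = plan("SELECT name FROM customer WHERE name = ?", ("Acme Corp",))
# A function-backed predicate has to be evaluated against every row.
regex_plan = plan("SELECT name FROM customer WHERE name REGEXP ?", (r"Corp$",))

print(equality_plan)
print(regex_plan)
```

The first plan should begin with SEARCH (an index seek) and the second with SCAN (every row is visited, even if the index is read to get the values). Combining the regex with another, index-friendly predicate that narrows the rows first is one way to limit the cost.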
williams.frd
Am I out of line in thinking this is something that should be handled in a high-level object-oriented language?
paul.knibbs
I'm not particularly familiar with the CLR--does processing stuff via RegEx in this way effectively mean you're using RBAR, or is it a bit cleverer than that?