string extraction help

  • Hi,

    I have some code in c# which extracts the text from a string that complies with these wildcards

    ([A-Za-z]{3,6}_?[0-9]{4,7})

    to break this down

    first 3 to 6 characters needs to be upper or lower case letters

    then there could be an underscore

    then the next 4 to 7 characters have to be numbers

    Does anyone know how to create a sql equivalent. Should I use PATINDEX, if so how to I denote the max and min length?

    Many Thanks for your help, it is very much appreciated,

    Oliver

  • Can you use CLR? RegEx tends to be far easier to implement in C# than in pure t-sql. There are certainly some pattern matching techniques for some basics but I am unaware of more complicated matching like you have going on.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks for this, my CLR skills are non - existent but it is something I need to look into. I don't think such powerful query tools exist to do this in SQL, I guess.

    Many Thanks for the reply.

  • oliver.morris (1/14/2013)


    Thanks for this, my CLR skills are non - existent but it is something I need to look into. I don't think such powerful query tools exist to do this in SQL, I guess.

    Many Thanks for the reply.

    No they don't - this requirement is definitely a strong CLR candidate.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Or if you can use SSIS to achieve whatever it is that you are doing, there is at least one CodePlex add-on that would allow you to use RegEx matching without having to write code ...

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Many Thanks for your help, this is good to know. Will try the CLR route.

    Cheers

  • It might seem a bit tricky and you may need to play with this a bit to get it working 100%, but I think you'll get the idea of what I'm trying to do here pretty quickly.

    WITH SampleData (s) AS (

    -- RegEx: ([A-Za-z]{3,6}_?[0-9]{4,7})

    SELECT 'ABVAAA_1234567' -- valid

    UNION ALL SELECT 'AASDAAA_123' -- invalid

    UNION ALL SELECT 'ACd123' -- valid

    UNION ALL SELECT 'ACD_12345677' -- invalid

    UNION ALL SELECT '12345677' -- invalid

    )

    SELECT s, PosUnd, PosNum, PosAlph, LEN(s)

    ,IsValid=CASE

    WHEN PosUnd > 0 AND PosAlph <> 0 AND PosNum - PosUnd = 1 AND PosUnd - PosAlph <= 6 AND LEN(s) - PosNum <= 6

    THEN 1

    WHEN PosUnd = 0 AND PosAlph <> 0 AND PosUnd - PosAlph <= 7 AND LEN(s) - PosNum <= 7

    THEN 1

    --WHEN

    ELSE 0 END

    FROM SampleData

    CROSS APPLY (SELECT PATINDEX('%[_]%', s)) a(PosUnd)

    CROSS APPLY (SELECT PATINDEX('%[0-9]%', s)) b(PosNum)

    CROSS APPLY (SELECT PATINDEX('%[A-Za-z]%', s)) c(PosAlph)

    Alternatively, using CLRs and understanding how to write them don't need to go hand in hand. You just need to have permissions on your database to install them.

    Here's a pretty nice library called SQL# developed by Solomon Rutzky (SQL Sharp Library of CLRs[/url]) that is really clean to install, is well documented and has the functions in it that you'll need to perform this validity check directly with the RegEx pattern you've provided. RegEx validations will probably be more costly in terms of CPU than doing something like what I provided, but you can always test that assumption (and you should) to be sure.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • I see what you are up to here. This is a really neat method and will work it up. Thank you for sending over the CLR reference, I will check this out as well.

    I really appreciate your help,

    Oliver

  • oliver.morris (1/15/2013)


    I see what you are up to here. This is a really neat method and will work it up. Thank you for sending over the CLR reference, I will check this out as well.

    I really appreciate your help,

    Oliver

    Happy to be of service! 😎


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply