Programming Regular Expressions

  • Little backround:

    mySQL on Unix.

    SQL Server 2000.

    Scanners read a MICR nunber into mySQL database and then it is uploaded to SQL 2000 database.

    For those of you who do not know a MICR is the routing number plus account and check number. Every bank has a differnet MICR.

    The programmer already has set Reg Ex for 90% of our clients.

    I created a database for the uncommon MICRs.

    database for 'uncommon' MICRs does the following:

    Script the uncommon MICRs to 'CNNNNNNNNNC NNNNNNNNNL NNN' where C = : N = number L = < etc. What my goal is, if a MICR (which is substring('CNNNNNNNNNC',2,9) has a match and the count is X then script sends me an email to put in the Reg Ex to the program. Sometimes I have 50 plus reg ex to program. I was wondering if there was a way to program the Reg Ex through SQL? I do not want the answer on how to do it, just some articles or such to point me in the right direction. Or is this even possible? If there is any confusion on my question, please let me know Thank you
    Christi

  • Since no one answered, I am wondering if my question was unclear.

    for an example here is a sample of a micr that is read into the database:

    :012345678: 80110627841< 1947

    the routing number would be: 012345678

    the first part of my RegEx would be : :012345678:

    that would be literal

    so I would start out at ^[:012345678:]

    what I would like to do is put the routing number into a variable @routing

    then the rest of the string: 80110627841< 1947

    I would like to build a RegEx. Because the second part can be anyones account number it is variable. So I created in SQL a pattern. NNNNNNNNNNNL NNNN

    How can I turn 'NNNNNNNNNNNL NNNN' into a RegEx? because NNNNNNNNNNNL NNNN is a variable number?

    Am I making sense or am I being more confusing? Is this possible? yes or no would be appreciated before I spend 20 or more hours trying to do the impossible.

  • I can't speak to a SQL 2000 solution,

    but in 2005 you can use a CLR function. See http://msdn.microsoft.com/msdnmag/issues/07/02/SQLRegex/default.aspx for more info on that.

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • I don't think there is an easy native way to do this in SQL Server. Recently I was trying to do some pattern matching on a text field and after spending several hours trying to get patindex, substring, replace to do some basic pattern matching, I finally gave up, wrote the output to a text file and used a sed one-liner. Frankly SQL Server and Windows is terrible at handling regular expressions. I did notice this article on SQL Server Central which implements regular expressions through extended stored procedures. If you can get this installed it will do what you want:

    http://www.sqlservercentral.com/columnists/mcoles/sql2000dbatoolkitpart2.asp

  • Unless there is more complex validation, I would not use regx, I would

    create errormicrtable to contain invalid data

    create micrtable containing valid MICR codes

    and use the following to validate and test MICR input

    INSERT INTO [errormicrtable]

    SELECT [micr] FROM [inputtable]

    WHERE [micr] NOT LIKE ':[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]: [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]< [0-9][0-9][0-9][0-9]'

    INSERT INTO [newmicrtable]

    SELECT a.[micr] FROM [inputtable] a

    WHERE a.[micr] LIKE ':[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]: [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]< [0-9][0-9][0-9][0-9]'

    AND NOT EXISTS (SELECT b.[micr] FROM [micrtable] b WHERE b.[micr] = SUBSTRING(a.[micr],2,9))

    GROUP BY a.[micr]

    HAVING COUNT(*) = [X]

    not sure why you only want to know missing MICR codes when count = X

    but the above will provide you with data to work with and you can include missing/new MICR codes by inserting them into the micrtable and deleting them from the newmicrtable

    p.s. it is easier to maintain a table than regex

    Far away is close at hand in the images of elsewhere.
    Anon.

  • In our program, which is written in Java and runs on a Unix system, the programmer has created RegEx for the MICR patterns which can parse about 90% of our clients. My assignment was to upload a file that had unknown MICR patterns where the program could not process. IF the MICR pattern has say a count of 5000 then I need to create a RegEx to give to the programmer. There are about 50 or so of them right now. I didn't feel like writting 50 RegEx, I was hoping to have them created for me through SQL.

  • If you absolutely must, you can use the sp_OACreate procedure to instantiate a RegEx object in a T-SQL stored procedure.  You can then access its properties and methods with other sp_OAxxx procedures, just don't forget to dispose of it with sp_OADestroy.  However, this instantiates COM objects in the SQL Server process and can cause instability if the objects are not well-behaved or if you don't use them correctly.

    You could use RegEx objects in an ActiveX script task in a DTS package.

    You could create an external program using RegEx objects in the language and data access method of your choice, and invoke it with xp_cmdshell.

    As an example of using the Object Automation procedures, I have used this function to get file creation dates with the FileSystemObject.

    CREATE

    FUNCTION ufn_GetFileCreationDate (@filename varchar(1024))

    RETURNS datetime

    AS BEGIN

        DECLARE @rv int 

        DECLARE @fso int 

        DECLARE @file int 

        DECLARE @CRDate datetime 

        EXEC @rv = sp_OACreate 'Scripting.FileSystemObject', @fso OUT 

        IF @rv = 0 BEGIN 

            EXEC @rv = sp_OAMethod @fso, 'GetFile', @file OUT, @filename

            IF @rv = 0 BEGIN 

                EXEC @rv = sp_OAGetProperty @file, 'DateCreated', @CRDate OUT

                EXEC @rv = sp_OADestroy @file 

            END 

            EXEC @rv = sp_OADestroy @fso 

        END

        RETURN @CRDate

    END

  • Hey, that's my checking account.

    Are you trying to write T-SQL code that will create REGEXes for you, as opposed to getting SQL to parse with REGEXes?

    And you want to count the number of occurences of :ABCDEFGHIJKL: in the first part of a table of strings, taking action when there are more than 5000?

    I'm confused.

     

     

     

  • Are you trying to write T-SQL code that will create REGEXes for you, as opposed to getting SQL to parse with REGEXes?

    YES!!!!!

    And you want to count the number of occurences of :ABCDEFGHIJKL: in the first part of a table of strings, taking action when there are more than 5000?

    Already taken care of.

  • I hope someone can help me.

    This is what I am working on, and if I am not explaining it correctly, PLEASE ask.

    In my company we scan checks. At the bottom of the check is a MICR number. No problem. Those are inserted into a database.

    I then have another t-sql script that changes the MICR.

    for example:

    :123456789: 000123456789

    is changed to

    CNNNNNNNNNCSNNNNNNNNNNNNL

    NOW, this is my quandry, how would I change

    CNNNNNNNNNCSNNNNNNNNNNNNL

    into

    ^: ([0-9]{9}): *([0-9]+)

    I have the first part

    ^: ([0-9]{9}):

    (extra space included so was not seen. ) where do I find out how to format code one this website?

    already, as that is static and the routing number. I have also parsed the data into another field, so it would be:

    SNNNNNNNNNNNNL

    into

    *([0-9]+)

    where:

    S = _* (ignore the _, just use a literal space)*

    N = [0-9]

    L =<

    My biggest problem is counting in the string and finding the next character. If it starts with S fine, I can program that, but how do I count or find out what the next character is?

    Am I making any sense?

    I am building a Regular expression string, if that helps.

    I have vb.net that is checking making sure the regex parses correctly. I just don't want to build (now) 60 reg ex. Yes or No can I do this in SQL? Come on all of you SQL Gurus!

    Actually, I do not want someone giving me the answer, just point me in the right direction.

  • I am a bit confused, the regex you say you want to get to, ie

    ^: ([0-9]{9}): *([0-9]+)

    will match any number combination

    or are you saying that the : may be different char or the number of digits {9} may be different

    can you post some varied examples and the expected regex

    to find a char in a string you can use CHARINDEX

    to pattern match you can use PATINDEX, ie

    PATINDEX('^[0-9]',column) will find the position of the first char that is not 0 to 9 and in combination with SUBSTRING, PATINDEX('^[0-9]',SUBSTRING(column,2,255)) will find the position of the first char that is not 0 to 9 ignoring the first char

    Far away is close at hand in the images of elsewhere.
    Anon.

  • David

    I have this in a field in my table

    CNNNNNNNNNCSNNNNNNNNNNNNL

    I also have a field that gives the LEN of that field.

    The first 11 characters will always be:

    CNNNNNNNNNC

    and the code would be:

    ^: ([0-9]{9}):

    What I would like to do is go through the rest of the string.

    The rest of the string is:

    SNNNNNNNNNNNNL

    I would like to look at the first character, in this case it's 'S'. I would then like to look at the next character, it's 'N', they do not match so I would stop the search.

    So, I would program (space)*

    Then I would look at the next character, it's 'N'. The character after that is also 'N', it's actually 'N' for 12 characters.

    So, I would program ([0-9]+)

    The character after the 12 'N's is L

    So, I would program <

    The end result would be

    ^: ([0-9]{9}): *([0-9]+)<$

    My issue is searching and stopping in the string when the character changes. How would I do that?

  • I also want to say that the dataset will always be small. Less then 100 so I do not mind using a cursor.

    in order to code the : CNNNNNNNNNCSNNNNNNNNNNNNL I created this script

    DECLARE @MICR varchar(60),

    @MICR_LEN int,

    @pos int,

    @value char(1),

    @pattern varchar(60),

    @expression varchar(60),

    @i tinyint

    SET @expression = ''

    DECLARE MICR_cursor CURSOR FOR

    SELECT MICR, MICR_LEN

    FROM dbo.MICRs_test

    where MICR not like '%=%'

    OPEN MICR_cursor

    FETCH NEXT FROM MICR_cursor

    INTO @MICR, @MICR_LEN

    WHILE @@FETCH_STATUS =0

    BEGIN

    SET @value = ''

    SET @pattern = ''

    WHILE @pos <= @MICR_LEN

    BEGIN

    SET @value = substring(@MICR,@pos,1)

    SELECT @i = COUNT(pValue)

    FROM Patterns

    WHERE pValue = @value

    IF @i = 1

    BEGIN

    SELECT @pattern = pattern

    FROM Patterns

    WHERE pValue = @value

    END

    ELSE

    BEGIN

    SET @pattern = 'U'

    END

    --print @pattern + ' ' + @value

    SET @pos = @pos + 1

    SET @expression = @expression + @pattern

    END

    --PRINT @MICR_LEN

    --PRINT @value

    --PRINT @pos

    --PRINT @expression

    INSERT dbo.MICR_Patterns_temp

    VALUES(@MICR,@expression)

    SET @expression = ''

    SET @pos = 1

    FETCH NEXT FROM MICR_cursor

    INTO @MICR, @MICR_LEN

    END

    CLOSE MICR_cursor

    DEALLOCATE MICR_cursor

  • Lets see if this will help.

    use the SUBSTRING(input_str,12,len(input_string)) function to return the rest of the input string. The put the result string in a WHILE loop to find, count, parse whatever you want.

    I guess based on your comments, you can check if the first character is an 'S' (space?) doing a left(result_string,1) = char(32) if it is the initiate your loop.

    Other string functions are RIGHT, CHARINDEX, REVERSE,...

  • Camilo, that is what I think I need to do.

    I am working on this now, and I 'think' I have it.

    I will post code if I do.

    Otherwise I will come on here and ask for help!

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply