Convert teradata REGEXP_INSTR into SQL SERVER

  • Hi,

    I need some guidelines to convert Teradata REGEXP_INSTR logic into SQL server code. Please find the attached file in which I had provided a sample script.

    Appreciate your reply in advance.

    Thanks

     

     

  • Not allow me to attahed .sql file so copy paste code here.

    SELECT case when REGEXP_INSTR(string, '/ OB/') > 0

    OR REGEXP_INSTR(string, '/OB/') > 0

    OR REGEXP_INSTR(string, 'OB//') > 0

    OR REGEXP_INSTR(string, 'T/O') > 0

    OR REGEXP_INSTR(string, '/ OB') > 0

    OR REGEXP_INSTR(string, 'ON OB') > 0

    THEN 'OB'

    when REGEXP_INSTR(string, 'TP[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'RR[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'RRO/[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TP [0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TP NUM[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'RRA[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, '/R[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TP NO[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TP NBR[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'RR NBR[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'RRO[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, '/RRO[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, '/RR [0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'RR[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TN-[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TP NO.[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TRIP PASS[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TPAUTH[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TPNO[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TAO[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, '/RPO[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'RRP[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'TP/[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, '/RFA[0-9]{5,7}') > 0

    OR REGEXP_INSTR(string, 'RRA[0-9]{5,7}') > 0

    THEN 'TP'

    ELSE 'PP' END XYZ

    FROM TABLE A

  • Think the easiest way will be to break out what each REGEX expression you indicated means and build up case statements that reflect this.  I am just going to help with the first one and let you go from there.  The "teach a man to fish" approach.  So you have this:

    SELECT case when REGEXP_INSTR(string, '/ OB/') > 0

    OR REGEXP_INSTR(string, '/OB/') > 0

    OR REGEXP_INSTR(string, 'OB//') > 0

    OR REGEXP_INSTR(string, 'T/O') > 0

    OR REGEXP_INSTR(string, '/ OB') > 0

    OR REGEXP_INSTR(string, 'ON OB') > 0

    THEN 'OB'

    So, lets grab each of these in turn and parse them.  /OB/ is the first one.  Now, without knowing 100% how teradata handles this or what your data looks like, I am going to assume that those are all literal things you are looking up.  That is you are looking up the literal string /OB/.  The reason I think this is because /OB/ is not valid REGEX (as far as I know and based on the online regex tool I was testing this with).  So, what I am thinking is that this is really doing "INSTR" (in string) and not actual REGEX based on how you wrote it.  So to convert the above to TSQL, something like this would work:

    SELECT CASE WHEN string LIKE '%/OB/%' 
    OR string LIKE '%OB//%'
    OR string LIKE '%T/O%'
    OR string LIKE '%/ OB%'
    OR string LIKE '%ON OB%'
    THEN 'OB'

    Now, if you are using actual REGEX (or some modified form of Regex used by teradata), you will need to tweak the above.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • Thanks for your reply, Mr. Brian. I got your point. I will first check with the client to run this script on their Teradata environment and based on the result will apply to convert logic.

  • You can create your own INSTR function leveraging ngrams8k.

    CREATE OR ALTER FUNCTION dbo.Instr8K
    (
    @string VARCHAR(8000),
    @search VARCHAR(100),
    @instance INT
    )
    /*
    Created by Alan Burstein, 20210406
    Similar to Teradata and Oracles INSTR and REGEX_INSTR functions
    */
    RETURNS TABLE AS RETURN
    SELECT Position = ISNULL(MAX(f.Position),0)
    FROM
    (
    SELECT ng.Position, ng.Token, Instance = ROW_NUMBER() OVER (ORDER BY (ng.Position))
    FROM samd.ngrams8k(@string,DATALENGTH(@search)) AS ng
    WHERE ng.Token = @search
    ) AS f
    WHERE f.instance = ISNULL(@instance,1);
    GO

    Here's how you would find the position of the 1st, 2nd, 3rd OR 4th instance of the text "AB" inside the string  "ABCXYZ123ABCAB".

    DECLARE @string   VARCHAR(1000) = 'ABCXYZ123ABCAB',
    @search VARCHAR(100) = 'AB';

    SELECT instr.Position FROM dbo.Instr8K(@string,@search,1) AS instr;
    SELECT instr.Position FROM dbo.Instr8K(@string,@search,2) AS instr;
    SELECT instr.Position FROM dbo.Instr8K(@string,@search,3) AS instr;
    SELECT instr.Position FROM dbo.Instr8K(@string,@search,4) AS instr;

    This shows that the first instance of AB is at position 1, the second at position 10, 13 for the third and 0 for the 4th instance as there are only three. Now for a simplified version of your problem which should help you get started.

    DECLARE @things TABLE (String VARCHAR(1000) UNIQUE);
    INSERT @things
    VALUES('>>>/OB///'),(' /OBXYZ'),('OB///GYN'),(''),('(^*^)'),('555'),('{5,7}'),('{5,5}'),
    ('MOT/O'),('88.{5,7},{5,9}')

    SELECT
    t.String, p.Pattern, i.Position
    FROM @things AS t
    CROSS JOIN (VALUES('/OB/'),('OB//'),('/OB'),('T/O')) AS p(Pattern)
    CROSS APPLY dbo.Instr8K(t.String,p.Pattern,1) AS i
    WHERE i.Position > 0;

    This returns:

    String       Pattern Position
    ------------ ------- ----------
    /OBXYZ /OB 2
    >>>/OB/// /OB/ 4
    >>>/OB/// OB// 5
    >>>/OB/// /OB 4
    MOT/O T/O 3
    OB///GYN OB// 1

    For a distinct list of matches I can modify my query like so:

    SELECT TOP(1) WITH TIES
    t.String, p.Pattern, i.Position
    FROM @things AS t
    CROSS JOIN (VALUES('/OB/'),('OB//'),('/OB'),('T/O')) AS p(Pattern)
    CROSS APPLY dbo.Instr8K(t.String,p.Pattern,1) AS i
    WHERE i.Position > 0
    ORDER BY ROW_NUMBER() OVER (PARTITION BY t.String ORDER BY Position);
    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply