Please help to get, Alphabet with underscore i.e special charecter

  • Steve Collins wrote:

    Afaik this actually produces the desired output.  Should work with SQL 2014.  It uses the ordinal splitter DelimitedSplit8K_Lead.  The query: splits the input string on '_' underscore, finds the offset location of the first integer within the split string, substrings to extract the lead characters, and removes the file extension.  Then it concatenates the calculated strings 'string_wo_ext' separated by underscores using the FOR XML trick.

    declare @string varchar(255) = 'EMA_20210526T211254_0000_MRNMM0000001240_PMS110460PAT000001287_PID15235307_OR_Driver_Licensefrom.pdf';

    select stuff((select '_' + calc.string_wo_ext
    from dbo.DelimitedSplit8K_Lead(@string, '_') ds
    cross apply (values (patindex('%[0-9]%', ds.Item))) frst(tndx)
    cross apply (values (iif(frst.tndx=0, Item, substring(ds.Item, 1, frst.tndx-1)))) string(string_to_agg)
    cross apply (values (left(string.string_to_agg, len(string.string_to_agg) - charindex('.', reverse(string.string_to_agg))))) calc(string_wo_ext)
    where Item like '[A-Za-z_]%'
    order by ds.ItemNumber
    for xml path('')), 1, 1, '') string_val;
    CREATE FUNCTION [dbo].[DelimitedSplit8K_LEAD]
    --===== Define I/O parameters
    (@pString VARCHAR(8000), @pDelimiter CHAR(1))
    RETURNS TABLE WITH SCHEMABINDING AS
    RETURN
    WITH E1(N) AS (
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
    ), --10E+1 or 10 rows
    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
    cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front
    -- for both a performance gain and prevention of accidental "overruns"
    SELECT 0 UNION ALL
    SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
    ),
    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
    SELECT t.N+1
    FROM cteTally t
    WHERE (SUBSTRING(@pString,t.N,1) = @pDelimiter OR t.N = 0)
    )
    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
    SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),
    Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,8000))
    FROM cteStart s
    ;

    THAT, good Sir, is a beautiful thing!  All you need to do now is give the author of that good function a little credit by citing the article where you got it from 😀

    Really nice job on the use of Table Valued Constructors in the Cross Applys to simplify the code.  It does have an issue if an underscore is included in the extension but really nice job, IMHO.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Steve Collins wrote:

    Instead of a WHILE loop you could try a numbers table or tally function approach

    declare @string varchar(255) = 'EMA_20210526T211254_0000_MRNMM0000001240_PMS110460PAT000001287_PID15235307_OR_Dr';

    select string_agg(v.chr, '') within group (order by fn.n) az_only
    from dbo.fnTally(1, len(@string)) fn
    cross apply (values (substring(@string, fn.n, 1))) v(chr)
    where v.chr like '[A-Za-z_]';

    I like this approach and modified it a bit to handle the file extension.

    /* SQL 2017 + */
    declare @string varchar(255) = 'EMA_20210526T211254_0000_MRNMM0000001240_PMS110460PAT000001287_PID15235307_OR_Driver_Licensefrom.pdf';
    ;with cteExtension as (
    Select case when CHARINDEX('.', REVERSE('.' + @string)) < 10 then RIGHT(@string, CHARINDEX('.', REVERSE('.' + @string)))
    else ''
    end as extension
    )
    , cteStringNoExtension as (
    Select case EX.extension when '' then LEFT(@string, LEN(@string) )
    else LEFT(@string, LEN(@string) - LEN(EX.extension))
    end as NoExtension
    from cteExtension EX
    )
    , cteString as (
    select string_agg(v.chr, '') within group (order by fn.n) az_only
    from cteStringNoExtension NE
    cross apply dbo.fnTally(1, len(NE.NoExtension)) fn
    cross apply (values (substring(NE.NoExtension, fn.n, 1))) v(chr)
    where v.chr like '[A-Za-z_]'
    )
    Select isnull(az_only,'') + EX.extension az_only
    from cteString
    cross apply cteExtension EX
    ;

     

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Jeff Moden wrote:

    THAT, good Sir, is a beautiful thing!  All you need to do now is give the author of that good function a little credit by citing the article where you got it from 😀

    Really nice job on the use of Table Valued Constructors in the Cross Applys to simplify the code.  It does have an issue if an underscore is included in the extension but really nice job, IMHO.

    It's from Eirikur Eiriksson's 2014 SSC article.  Sorry about that.  This query was the last thing I did before falling asleep last night.  Long day

    Thanks for noticing the code!  It's appreciated especially from you Jeff.  Many thanks for the compliment.  Also, thanks for everything you do 🙂  SSC forums are the top in SQL imo

    Aus dem Paradies, das Cantor uns geschaffen, soll uns niemand vertreiben können

  • Thanks for the feedback Steve.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This was removed by the editor as SPAM

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply