Split string using multiple delimiters - version 2

  • Comments posted to this topic are about the item Split string using multiple delimiters - version 2

  • Interesting. It does a lot of work for the result. Have you speed-tested it against the house splitter?

    Here's one I knocked up in a couple of minutes to show you what I mean:

    create FUNCTION [dbo].[il_SplitDelimitedStringArray_MD]

    --===== Define I/O parameters

    (@pString VARCHAR(8000), @pDelimiter VARCHAR(32))

    RETURNS TABLE WITH SCHEMABINDING AS

    RETURN

    --===== "Inline" CTE Driven "Tally Table” produces values from 0 up to 10,000...

    -- enough to cover VARCHAR(8000)

    WITH E1(N) AS (

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL

    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1

    ), --10E+1 or 10 rows

    E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows

    E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max

    cteTally(N) AS (--==== This provides the "zero base" and limits the number of rows right up front

    -- for both a performance gain and prevention of accidental "overruns"

    SELECT 0 UNION ALL

    SELECT TOP (DATALENGTH(ISNULL(@pString,1))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4

    ),

    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)

    SELECT t.N+1

    FROM cteTally t

    WHERE @pDelimiter LIKE '%'+SUBSTRING(@pString,t.N,1)+'%' OR t.N = 0

    )

    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.

    SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY s.N1),

    Item = SUBSTRING(@pString,s.N1,ISNULL(NULLIF((LEAD(s.N1,1,1) OVER (ORDER BY s.N1) - 1),0)-s.N1,8000))

    FROM cteStart s

    ;

    SELECT * FROM dbo.il_SplitDelimitedStringArray_MD(

    'http://www.google.com;http://www.yahoo.com|http://www.msn.com~http://www.twitter.com,http://www.facebook.com~http://www.sqlservercentral.com;http://www.social.technet.microsoft.com,http://www.sqlmag.com;http://www.sqlperformance.com,http://www.sqlteam.com',

    ';|,~')

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (9/15/2016)


    Interesting. It does a lot of work for the result. Have you speed-tested it against the house splitter?

    Here's one I knocked up in a couple of minutes to show you what I mean:

    You chose an interesting way of saying this.

  • Iwas Bornready (9/15/2016)


    ChrisM@Work (9/15/2016)


    Interesting. It does a lot of work for the result. Have you speed-tested it against the house splitter?

    Here's one I knocked up in a couple of minutes to show you what I mean:

    You chose an interesting way of saying this.

    Care to elaborate? How about setting up the performance comparison? 😀

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Wonderful work, what if we have a new line or any other escape sequences as a delimiter?

  • I got the solution for this , simply append the what ever needed escape sequences at the end of specified string of delimiters , Thank you

  • ChrisM@Work wrote:

    Iwas Bornready (9/15/2016)


    ChrisM@Work (9/15/2016)


    Interesting. It does a lot of work for the result. Have you speed-tested it against the house splitter? Here's one I knocked up in a couple of minutes to show you what I mean:

    You chose an interesting way of saying this.

    Care to elaborate? How about setting up the performance comparison? 😀

    Old post, I know but, to explain the likely reason for the comment, "Knocked Up" has a seriously different "slang" meaning here in the states. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply