Split String Function

  • Comments posted to this topic are about the item Split String Function

  • This is a nice function but I don't see any recursion, at least not as I understand it. What exactly do you mean?

  • The recursion comes in with the multiple executions of the SUBSTRING function. By using the Numbers table, or the tally table generated on the fly depending on your implementation, we are able to walk through the string, character by character, executing the SUBSTRING function along the way to find our delimiter and parse out the value we want to output to the caller. Recursion can be a source of hidden RBAR (Row By Agonizing Row), but tends to be lighter weight than a WHILE LOOP or CURSOR.

  • I see. I'm not sure I'd call that recursion, but it's a neat trick.

    Here's another approach using a recursive CTE. I haven't coded it up as a function or catered for text qualified separators, but you get the idea:

    DECLARE @String VARCHAR(255) = '1,2,345,6,yyr';

    DECLARE @Separator VARCHAR(1)= ',';

    WITH CTE

    AS ( SELECT 1 AS PartNo ,

    SUBSTRING(@String + @Separator, 1,

    CHARINDEX(@Separator, @String + @Separator,

    1) - 1) AS Part ,

    SUBSTRING(@String,

    CHARINDEX(@Separator, @String + @Separator,

    1) + 1,

    LEN(@String + @Separator)

    - CHARINDEX(@Separator, @String + @Separator,

    1)) AS Remainder

    UNION ALL

    SELECT PartNo + 1 AS PartNo ,

    SUBSTRING(Remainder + @Separator, 1,

    CHARINDEX(@Separator, Remainder + @Separator,

    1) - 1) AS Part ,

    SUBSTRING(Remainder,

    CHARINDEX(@Separator, Remainder + @Separator,

    1) + 1,

    LEN(Remainder + @Separator)

    - CHARINDEX(@Separator,

    Remainder + @Separator, 1)) AS Remainder

    FROM CTE

    WHERE Remainder <> ''

    )

    SELECT PartNo ,

    Part

    FROM CTE

    (Apologies, this was beautifully formatted before I c&ped from SSMS)

  • A bit of Googling around this topic led me to this article, which is pretty comprehensive:

    http://sqlperformance.com/2012/07/t-sql-queries/split-strings

  • That is a great article. I hadn't read that before, but it definitely brings the performance of different approaches to light.

  • I recommend on this function(much better performance,Do it by XML query):

    CREATE FUNCTION dbo.uf_SplitStringsToTable_XML

    (

    @List NVARCHAR(MAX),

    @Delimiter NVARCHAR(255)

    )

    RETURNS TABLE

    AS

    RETURN

    (

    SELECT Data = y.i.value('(./text())[1]', 'INT')

    FROM

    (

    SELECT x = CONVERT(XML, '<i>'

    + REPLACE(@List, @Delimiter, '</i><i>')

    + '</i>').query('.')

    ) AS a CROSS APPLY x.nodes('i') AS y(i)

    );

  • Yes. That is better preforming as long as no XML reserved words are used.

    https://technet.microsoft.com/en-us/library/ms145315%28v=sql.90%29.aspx

  • -- Tally Table approach (No R-BAR optimal for millions of rows.)

    CREATE FUNCTION udf_splitString

    (

    @STR NVARCHAR(MAX),

    @sep NCHAR(1)

    )

    RETURNS TABLE

    AS

    RETURN(

    WITH cteTally(POS)

    AS

    (

    SELECT TOP (LEN(ISNULL(@sep + @STR + @sep, 0))) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))

    FROM SYS.COLUMNS a CROSS APPLY SYS.COLUMNS b

    )

    SELECT SUBSTRING(@sep + @STR + @sep, POS + 1, CHARINDEX(@sep, @sep + @STR + @sep, POS + 1) - POS -1) [Value]

    FROM cteTally

    WHERE POS <= LEN(@sep + @STR + @sep) - 1

    AND SUBSTRING(@sep + @STR + @sep, POS, 1) = @sep

    );

    I think this a much cleaner and simple approach towards achieving the same result. Please test it on different sets and let me know if any changes are needed.

  • Here are a few other approaches and there impacts:

    -- Recursive CTE approach (Hidden R-BAR optimal for thousands to few hundred thousand rows.)

    WITH cte_Split_String

    AS

    (

    SELECT CAST(0 AS BIGINT) AS idx1,CHARINDEX(@sep,@str) idx2

    UNION ALL

    SELECT idx2 + 1,CHARINDEX(@sep,@str,idx2+1)

    FROM cte_Split_String

    WHERE idx2>0

    )

    INSERT INTO @value

    SELECT SUBSTRING(@str,idx1,COALESCE(NULLIF(idx2,0),LEN(@str)+1)-idx1) [value]

    FROM cte_Split_String

    OPTION (MAXRECURSION 0)

    -- XML transform approach (Transform overhead and delimiter restrictions)

    DECLARE @xml XML = (SELECT CONVERT(XML,'<r>' + REPLACE(@str,@sep,'</r><r>') + '</r>'))

    INSERT INTO @value(Value)

    SELECT t.value('.','NVARCHAR(MAX)')

    FROM @xml.nodes('/r') AS x(t)

  • Also take a look at this article Tally OH! An Improved SQL 8K “CSV Splitter” Function[/url]

  • Probably the best you will find out there: http://www.sqlservercentral.com/articles/Tally+Table/72993/

  • Good function, thanks.

  • I would suggest looking at Jeff Moden's function

    Tally OH! An Improved SQL 8K “CSV Splitter” Function at http://www.sqlservercentral.com/articles/Tally+Table/72993/

  • unComplicate:

    create table #tmp (x nvarchar(max))

    declare @s-2 nvarchar(max) = 'a,1,2,x,3,4,z'

    declare @aux nvarchar(max)

    while charindex(',',@s) > 0

    begin

    set @aux = substring(@s,0,charindex(',',@s))

    SET @s-2 = LTRIM(STUFF(@s,1,len(@aux)+1,''))

    insert into #tmp select @aux

    end

    insert into #tmp select @s-2

    select x from #tmp

    drop table #tmp

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply