Convert String to a Table using CTE

  • Comments posted to this topic are about the item Convert String to a Table using CTE

  • The approach mentioned using CTE does not work for string (@array) which has more than 100 numbers. (e.g '1,2,3,4,5,....,101'). SQL Server returns following error message -

    "Msg 530, Level 16, State 1, Line 1

    The statement terminated. The maximum recursion 100 has been exhausted before statement completion."

    In order to overcome this error following query hint need to be used in Select statement with max value of recursion -

    option (MAXRECURSION 32767)

    Note : This is not a generic solution, since the query will fail again if the string (@array) has more than 32767 numbers which form a comma separated string.

  • This would fail if you have only one item in the @array var without a trailing comma e.g:

    DECLARE @array VARCHAR(max)

    SET @array = '1'

    SELECT item FROM strToTable(@array,',')

    This would break the SUBSTRING statements. The function requires that there is a trailing delimiter e.g:

    DECLARE @array VARCHAR(max)

    SET @array = '1,'

    SELECT item FROM strToTable(@array,',')

    It would be nice if the function could handle a single item in the @array var without a trailing delimiter.

    Myles J

  • Another variant

    create FUNCTION [dbo].[ft_ConvertStrToTable](

    @STR varchar(8000), -- ?????? ? ?????????????

    @sDiv char(1) -- ?????? ???????????

    )

    RETURNS

    @ConvertStrToTable TABLE

    (

    sValue varchar(255)

    )

    AS

    BEGIN

    Declare @Pos int, @PosPrev int;

    set @STR = @STR+@sDiv;

    set @Pos = 1;

    set @PosPrev = @Pos;

    while 1=1

    begin

    set @Pos = CHARINDEX(@sDiv, @STR, @PosPrev);

    if @Pos = 0

    break;

    insert into @ConvertStrToTable (sValue)

    values(substring(@Str, @PosPrev, @Pos-@PosPrev));

    set @PosPrev = @Pos+1;

    end;

    RETURN

    END

  • Interesting idea indeed! However there are obviously some limitations (maxrecursion being one of those), plus the performance 'may' not be up to the mark.

    The best method to split a integer CSV to a string (w.r.t. performance) is the 'crude' one i.e. finding and extracting integers one by one (see below sample user defined function). We have compared various approaches and found this method to be most performing of all the implementations.

    Create Function dbo.udf_1 (

    @csv varchar(max)

    )

    Returns @tbl Table (IntValue int)

    As

    Begin

    If( @csv Is Null Or Len(@csv) = 0 )

    Return

    Declare @iStartPosition int, @iEndPosition int,

    @vcTmpId varchar(15), @cDelimiter char(1)

    Select @iStartPosition = 1, @cDelimiter = ',', @iEndPosition = charindex( @cDelimiter, @csv )

    While @iEndPosition <> 0

    Begin

    Select @vcTmpId = substring(@csv, @iStartPosition, @iEndPosition - @iStartPosition)

    Select @iStartPosition = @iEndPosition + 1

    Insert Into @tbl Values( @vcTmpId )

    Select @iEndPosition = charindex( @cDelimiter, @csv, @iStartPosition )

    End

    Select @vcTmpId = substring(@csv, @iStartPosition, Len(@csv) - @iStartPosition + 1)

    Insert Into @tbl Values( @vcTmpId )

    Return

    End

    Go

    -- Usage

    Select * From dbo.udf_1('1,2,2342,3534,46546,4354,22')

    Go

  • with the hint

    OPTION(MAXRECURSION 0)

    no limits for recursion.

    the problem of the recursion error should be solved ...

  • Ok.

    And I've got a really quick one for this solution if you start dealing with large numbers of items.

    Instead of comma separating as they get passed in, turn them into 10 char padded with spaces, such that the

    1st id stored from chars 1-10

    2nd from 11-20

    and so on

    then the built string gets passed to the proc as that

    then you can use a tally table to break it up in one simple command

    select Id = substring(@idlist, t.n * 20 - 19, t.n * 20), idx= n

    From dbo.tally where n < (len(@idlist)+19)/20

  • This CTE method is not fool proof: it doesn't work if the separator character is not in the main string:

    Msg 536, Level 16, State 5, Line 3

    Invalid length parameter passed to the SUBSTRING function.

    Wilfred
    The best things in life are the simple things

  • Hi,

    even a trailing delimiter would not work correct

    eg. select * from strtotable('1,',','),

    because then the result is 1 and 0 !

    kr/Werner

  • I simply use XML.

    Using XML whole complex structures can be passed to SQL and treated as Tables.

    You can use Functions with the XML to render these into virtual tables and treat these as regular tables!

  • Thanks for the good article, Amit. I was looking for a way this morning to find all the dates between start date and end date, and there you go, your article is here. I manage to replicate your script to find all the middle dates. Thanks..

    and thanks for other contributors for the 'MAXRECURSION' option hints..

    ALTER FUNCTION [dbo].[MiddleDatesToTable]

    (

    @StartDateDATETIME,

    @EndDateDATETIME

    )

    RETURNS

    @listTable TABLE

    (

    item DATETIME

    )

    AS

    BEGIN

    ;WITH rep (item, nextday) AS

    (

    SELECTDATEADD(DD,1,@StartDate) as 'item', DATEADD(DD,2,@StartDate) as 'nextday'

    WHEREDATEDIFF(DD, DATEADD(DD,1,@StartDate), @EndDate) > 0

    UNION ALL

    SELECT nextday as 'item', DATEADD(DD,1,nextday) as 'nextday'

    FROM rep

    WHEREDATEDIFF(dd,nextday, @EndDate) > 0

    )

    INSERT INTO @listTable

    SELECT item FROM rep option (MAXRECURSION 0)

    RETURN

    END

  • Always good to see different ways of approaching things. Will have to have a look at the issue with the single item here and then maybe set up some performance tests with the examples given.

    I also have been using the tally table version.

    CREATE FUNCTION dbo.ProcessStringArray (@inputString NVARCHAR(max), @separator NCHAR(1))

    RETURNS @output TABLE(arrayItem NVARCHAR(4000))

    AS

    BEGIN

    /**

    * Add start and end separators to the Parameter so we can handle single elements

    **/

    SET@inputString = @separator + @inputString + @separator

    INSERT@output

    /**

    * Join the Tally table to the string at the character level and when we find a separator

    * insert what's between that separator and the next one

    **/

    SELECTSUBSTRING(@inputString,N+1,CHARINDEX(@separator,@inputString,N+1)-N-1)

    FROMdbo.Tally

    WHEREN < LEN(@inputString)

    ANDSUBSTRING(@inputString,N,1) = @separator

    RETURN

    END

    Where N is the integer column of the tally table.

    Pretty certain I found this based on another article here somewhere but cannot find it to give credit.

    [font="Arial"]_______________________________________________________
    Change is inevitable... Except from a vending machine.[/font]

  • I've used a function similar to MiddleDatestoTable in my apps, however I followed a different approach.

    Create Function dbo.udf_NumbersTable (

    @riStartingNumber Int, @riCount Int

    )

    Returns @tbl Table (SequenceNumber Int)

    As

    Begin

    -- Logic used from http://www.sqlservercentral.com/articles/Datetime+Manipulation/61822/

    ;With

    tblLevel0 As (Select 1 As C Union All Select 1), --2 rows

    tblLevel1 As (Select 1 As C From tblLevel0 As A, tblLevel0 As B),--4 rows

    tblLevel2 As (Select 1 As C From tblLevel1 As A, tblLevel1 As B),--16 rows

    tblLevel3 As (Select 1 As C From tblLevel2 As A, tblLevel2 As B),--256 rows

    tblLevel4 As (Select 1 As C From tblLevel3 As A, tblLevel3 As B),--65536 rows

    tblLevel5 As (Select 1 As C From tblLevel4 As A, tblLevel4 As B),--4294967296 rows

    tblSeq As (Select Row_Number() Over(Order By C) As N From tblLevel5)

    Insert Into @tbl( SequenceNumber )

    Select N + @riStartingNumber - 1 From tblSeq Where N Between 1 AND @riCount

    Return

    End

    Go

    Declare @dtStart DateTime, @dtEnd DateTime

    Declare @iDays Int

    -- Example values

    Set @dtStart = '20091201'

    Set @dtEnd = '20091225'

    -- Usage

    Set @iDays = DateDiff(d, @dtStart, @dtEnd) + 1

    Select DateAdd(d, SequenceNumber, @dtStart) As DateValue

    From dbo.udf_NumbersTable(0, @iDays)

  • ok .. replace the dates with variables and create a proc

    WITH mycte AS (SELECT cast ('2010-01-01' AS DATETIME) DateValue

    UNION ALL

    SELECT DateValue + 1

    FROM mycte

    WHERE DateValue + 1 < cast ('2011-01-01' AS DATETIME))

    SELECT

    year(datevalue) as [Year],

    month(datevalue) as [Month],

    day(datevalue) as [Day],

    datename(dw, DateValue) Weekday

    from mycte

    OPTION (MAXRECURSION 0)

  • Liked the article thanks.

    But, uh-oh, here we go again with the string-splitting debate...:laugh:

    Like it or not, the overall best method ever is to use a CLR TVF. It just is.

    http://sqlblog.com/blogs/adam_machanic/archive/2009/04/28/sqlclr-string-splitting-part-2-even-faster-even-more-scalable.aspx

    Paul

Viewing 15 posts - 1 through 15 (of 73 total)

You must be logged in to reply to this topic. Login to reply