Split string using Tally Table

  • jcrawf02 (7/2/2010)


    Hugo, thanks again for the excellent re-write and explanation.

    Not sure why this one is tripping folks up, if you just count the delimiters, you can see it will return 13 pieces?

    ...anyway...good question, I'm sure it's introduced some folks to this idea that haven't seen it before.

    I assumed the extra crazy in string handling was doing something to consume the double/adjacent delimiter, else this was "simply" a string split using Tally. The question was so easy I assumed it was a trick. 🙁

  • Hello all and thanks for the time spend on answering my posted question.

    Originally i came across with a similar t-sql already posted by Hugo. That's definitely the simplest and quickest way of doing split string using tally. Thanks Hugo.

    Nevertheless i wanted to make sure i could get the same result by using some string functions.

    Sorry about the messy code... I should have submitted correctly formatted. :blush: Consider it as an extra level of difficulty. 😉

    Anyway, as almost of you got it, the catch was to look to where clause...

    My major concern was to post an example of using tally. Back a few months ago i read some articles discussing tally and how to replace cursors and while loops with it and become a huge fan since then. 😀

    I know it isn't the best example, but got you guys thinking on it. 😛

    Best Regards,

    PM

  • For a SQL Server 2000 (and prior) dinosaur like myself, the whole concept of defining a Common Table Expresssion (CTE), never mind a recursive one, was new to me.

    If anyone else is in the same boat, I found the following article very helpful:

    http://www.4guysfromrolla.com/webtech/071906-1.shtml

  • Thanks Hogo for explanation.

    I was completely lost in question and does not even know what to answer. A brain teaser with small query is always good but with this big query that to in a image file is a brain torture.

    Well, I got right but just by fluke. No efforts on trying to find, why I was right.;-)

    SQL DBA.

  • BWAA-HAA!!!! First, this is NOT a good example of Tally Table code. It doesn't use a Tally Table and it doesn't use anything that could be called efficient. Anyone who uses a recursive CTE to generate Tally numbers just doesn't know what hidden RBAR is. 😛

    I agree with what someone else has stated... this code should be used as an example of how NOT to write code and how NOT to accomplish a split. Same goes for any example in this thread that uses a recursive CTE to do the split. 😉

    I hope no one actually copies the code to use as a split function but I will say it was very interesting in how many "extra" things were added to the code in an attempt at obfuscation. Heh... maybe it should be titled "job security methods". :hehe:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Here is what may be considered a better example of using a Tally table. It is dynamically created with the function. And FYI, it is formatted the way I like to format my code. I've heard some complain that it makes my code look more complex.

    USE [SandBox]

    GO

    /****** Object: UserDefinedFunction [dbo].[DelimitedSplit] Script Date: 07/02/2010 17:28:34 ******/

    SET ANSI_NULLS ON

    GO

    SET QUOTED_IDENTIFIER ON

    GO

    CREATE function [dbo].[DelimitedSplit] (

    @pString varchar(max),

    @pDelimiter char(1)

    )

    returns table

    as

    return

    with

    a1 as (select 1 as N union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1 union all

    select 1),

    a2 as (select

    1 as N

    from

    a1 as a

    cross join a1 as b),

    a3 as (select

    1 as N

    from

    a2 as a

    cross join a2 as b),

    --a4 as (select

    -- 1 as N

    -- from

    -- a3 as a

    -- cross join a2 as b),

    Tally as (select top (len(@pString))

    row_number() over (order by a.N) as N

    from

    a3 as a

    cross join a2 as b),

    ItemSplit(

    ItemOrder,

    Item

    ) as (

    SELECT

    N,

    SUBSTRING(@pDelimiter + @pString + @pDelimiter,N + 1,CHARINDEX(@pDelimiter,@pDelimiter + @pString + @pDelimiter,N + 1) - N - 1)

    FROM

    Tally

    WHERE

    N < LEN(@pDelimiter + @pString + @pDelimiter)

    AND SUBSTRING(@pDelimiter + @pString + @pDelimiter,N,1) = @pDelimiter --Notice how we find the delimiter

    )

    select

    row_number() over (order by ItemOrder) as ItemID,

    Item

    from

    ItemSplit

    GO

    declare @Text varchar(max),

    @StringDelimiter char(1);

    SET @Text = 'This T-SQL will split these sentences into rows.' +

    'How many rows will be returned?.' +

    'M.a.y.b.e..n.o.n.e.?';

    SET @StringDelimiter = '.';

    select * from dbo.DelimitedSplit (@Text, @StringDelimiter);

  • Can someone explain me how the script given below is working ?

    Especially the CASE Statement

    DECLARE @Text NVARCHAR(2000)

    DECLARE @StringDelimiter CHAR(1)

    SELECT @Text = 'This T-sql will split senteneces into rows.'+

    'How many rows will be returned?.'+

    'M.a.y.b.e..n.n.o.e.?',

    @StringDelimiter = '.';

    With Tally (Number)

    AS

    (

    SELECT ROW_NUMBER() OVER( ORDER BY (SELECT NULL) )AS Number

    UNION ALL

    SELECT Number + 1 AS Number

    FROM Tally WHERE Number <= LEN(@Text)

    )

    SELECT CASE WHEN RIGHT(LEFT(@Text,Number),

    CASE WHEN CHARINDEX(@StringDelimiter,REVERSE(LEFT(@Text,Number -1)),0) > 0

    THEN CHARINDEX(@StringDelimiter,REVERSE(LEFT(@Text,Number -1)),0) - 1

    ELSE CHARINDEX(@StringDelimiter,Reverse(LEFT(@Text,Number -1)),0) END) = ''

    AND

    CHARINDEX(@StringDelimiter,REVERSE(LEFT(@Text,Number -1))) = 0

    THEN LEFT(@Text,Number -1)

    ELSE RIGHT(LEFT(@Text,Number - 1),

    CASE WHEN CHARINDEX(@StringDelimiter,Reverse(LEFT(@Text,Number -1)),0) >0

    THEN CHARINDEX(@StringDelimiter,REVERSE(LEFT(@Text,Number -1)),0) -1

    ELSE CHARINDEX(@StringDelimiter,REVERSE(LEFT(@Text,Number -1)),0) END)

    END AS SPLIT FROM TALLY

    WHERE(NCHAR(UNICODE(SUBSTRING(@Text,Number,1))) = @StringDelimiter

    ORNumber - 1 = LEN(@Text))

    OPTION (MAXRECURSION 32767)

  • ricky70rana (7/2/2010)


    Can someone explain me how the script given below is working ?

    Especially the CASE Statement

    DECLARE @Text NVARCHAR(2000)

    DECLARE @StringDelimiter CHAR(1)

    SELECT @Text = 'This T-sql will split senteneces into rows.'+

    'How many rows will be returned?.'+

    'M.a.y.b.e..n.n.o.e.?',

    @StringDelimiter = '.';

    With Tally (Number)

    AS

    (

    SELECT ROW_NUMBER() OVER( ORDER BY (SELECT NULL) )AS Number

    UNION ALL

    SELECT Number + 1 AS Number

    FROM Tally WHERE Number <= LEN(@Text)

    )

    SELECT CASE WHEN RIGHT(LEFT(@Text,Number),

    CASE WHEN CHARINDEX(@StringDelimiter,REVERSE(LEFT(@Text,Number -1)),0) > 0

    THEN CHARINDEX(@StringDelimiter,REVERSE(LEFT(@Text,Number -1)),0) - 1

    ELSE CHARINDEX(@StringDelimiter,Reverse(LEFT(@Text,Number -1)),0) END) = ''

    AND

    CHARINDEX(@StringDelimiter,REVERSE(LEFT(@Text,Number -1))) = 0

    THEN LEFT(@Text,Number -1)

    ELSE RIGHT(LEFT(@Text,Number - 1),

    CASE WHEN CHARINDEX(@StringDelimiter,Reverse(LEFT(@Text,Number -1)),0) >0

    THEN CHARINDEX(@StringDelimiter,REVERSE(LEFT(@Text,Number -1)),0) -1

    ELSE CHARINDEX(@StringDelimiter,REVERSE(LEFT(@Text,Number -1)),0) END)

    END AS SPLIT FROM TALLY

    WHERE(NCHAR(UNICODE(SUBSTRING(@Text,Number,1))) = @StringDelimiter

    ORNumber - 1 = LEN(@Text))

    OPTION (MAXRECURSION 32767)

    I realize you would like to understand what this code is doing, but if you are looking for a delimited split function, I think you will find the code I posted previously a much simplier routine to understand. Please note, however, that there are other routines out there tat may perform better as well. I know mine starts having some performance issues when you start working with character strings that are > 8000 bytes.

  • Actually you are right that i want to understand how that code is working.Only was able to understand Common Table Expression out of this script. Actually I want to understand how the CASE Statement is working. Do you have an idea about that?

    I already gone through with your script and that is less cumbersome than this one and easy to understand. Your post is good help to understand tally table and delimiters.

  • ricky70rana (7/2/2010)


    Actually you are right that i want to understand how that code is working.Only was able to understand Common Table Expression out of this script. Actually I want to understand how the CASE Statement is working. Do you have an idea about that?

    I already gone through with your script and that is less cumbersome than this one and easy to understand. Your post is good help to understand tally table and delimiters.

    Without copying the code, reformating it to a more readable style for myself, then working through it, no I really don't. It is overly obfusicated when there are much easier ways of accomplishing the necessary task of a delimited split.

    If you really want to understand the code, take the time to reformat the code so that you can understand the flow, then work through it like a computer would.

  • Very Critical 😀

  • ricky70rana (7/2/2010)


    Can someone explain me how the script given below is working ?

    Especially the CASE Statement

    The CASE expression (sorry about that, couldn't resist) -or rather the nested CASE expressions- are, as already mentioned, overly complicated.

    Basically, it makes use of LEFT(@Text, Number) to get the first Number characters of the string, then uses CASE to decide if there is a @StringDelimiter somewhere in that last part - if there is, it takes the rightmost bit until the last @SDtringDelimiter, otherwise it takes the whole part.

    The nested CASE are required to prevent out-of-bound errors in the SUBSTRING functions.


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • I have to admit ... I was looking at this and saying "What the heck???"

    For all those interested, the latest version of the "DelimitedSplit8k" function can be found here.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • jts_2003 (7/2/2010)


    I think this is an example of why people don't like to use OVER, since it's so hard to work out what might/will be returned!

    I'd like to see a simpler question or articles on how OVER works - any takers?

    It's not that hard at all, once you understand it. Did you see this article[/url] that was just recently published in April?

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Like someone else who replied, I shortcircuited the whole thing & counted the delimiters (well, more specifically the spaces between the delimiters), assuming it was an obfuscation question not a "Haha! Tricked you!" question.

    Took one skim of the code under the Tally CTE and my brain went "bleh". Seen much cleaner string-splitters.

    An exercise in picking through nested functions & conditional statements if nothing else.



    Scott Duncan

    MARCUS. Why dost thou laugh? It fits not with this hour.
    TITUS. Why, I have not another tear to shed;
    --Titus Andronicus, William Shakespeare


Viewing 15 posts - 16 through 30 (of 41 total)

You must be logged in to reply to this topic. Login to reply