How to split Delimited text

  • MaverickMan

    SSC Eights!

    Points: 925

    Here is the format of data we have currently I am looking to split the text into various columns like how you can do it in excel:

    Column

    Home/Documents/Text/Files/

    Home/Documents/Text

    Home/Documents/Text/Files/XYZ.txt

    Home/Documents/Text/Files/

    Home/Documents/

    I would like to select the column in this format:

    Column A Column B Column C Column D

    Home/ Documents/ Text/ Files/

    Home/ Documents/ Text/

    Home/ Documents/ Text/ Files/

    Home/ Documents/ Text/ Files/

  • Sean Lange

    SSC Guru

    Points: 286536

    That is kind of nasty. You first have to split your string, then you have a dynamic cross tab.

    This really should be done in the front end. If there is no other way you can do this but it is not a simple task. Take a look at the string splitter link in my signature for the first part.

    Then take a look at Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs[/url] for the dynamic cross tab portion.

    If you run into trouble post back with ddl, sample data and the desired output and we can tackle it.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • The Dixie Flatline

    SSC Guru

    Points: 53253

    Its not that nasty, it's a fairly straightforward parse, BUT....

    Column

    Home/Documents/Text/Files/

    Home/Documents/Text

    Home/Documents/Text/Files/XYZ.txt

    Home/Documents/Text/Files/

    Home/Documents/

    There is no "Text" in the last line of your input, yet you want it to appear in your output.

    It "Text" a default value for Column C?

    Are you SURE your inputs and outputs above are exactly like you want them?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Jeff Moden

    SSC Guru

    Points: 996622

    MaverickMan (4/27/2012)


    Here is the format of data we have currently I am looking to split the text into various columns like how you can do it in excel:

    Column

    Home/Documents/Text/Files/

    Home/Documents/Text

    Home/Documents/Text/Files/XYZ.txt

    Home/Documents/Text/Files/

    Home/Documents/

    I would like to select the column in this format:

    Column A Column B Column C Column D

    Home/ Documents/ Text/ Files/

    Home/ Documents/ Text/

    Home/ Documents/ Text/ Files/

    Home/ Documents/ Text/ Files/

    For better answer quicker in the future, please see the article at the first link in my signature line below.

    If you want an almost infinite number of path levels, we'll need to change the code but this will solve the problem you have posted.

    --===== Create the test tabler and populate it.

    DECLARE @SomePath TABLE

    (SomePath VARCHAR(512))

    ;

    INSERT INTO @SomePath

    SELECT 'Home/Documents/Text/Files/' UNION ALL

    SELECT 'Home/Documents/Text' UNION ALL

    SELECT 'Home/Documents/Text/Files/XYZ.txt' UNION ALL

    SELECT 'Home/Documents/Text/Files/' UNION ALL

    SELECT 'Home/Documents/'

    ;

    --====== Solve the problem

    WITH

    cteEnumerate AS

    ( --=== Create a RowNum reference that we can reassemble rows by

    SELECT RowNum = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),

    SomePath

    FROM @SomePath

    ) --=== Split the rows and reassemble them using a classic "CROSS TAB".

    -- Get the "split" function from the "Resources" section of the following article.

    -- http://www.sqlservercentral.com/articles/Tally+Table/72993/

    SELECT [Column A] = MAX(CASE WHEN split.ItemNumber = 1 THEN split.Item + '/' ELSE '' END),

    [Column B] = MAX(CASE WHEN split.ItemNumber = 2 THEN split.Item + '/' ELSE '' END),

    [Column C] = MAX(CASE WHEN split.ItemNumber = 3 THEN split.Item + '/' ELSE '' END),

    [Column D] = MAX(CASE WHEN split.ItemNumber = 4 THEN split.Item + '/' ELSE '' END)

    FROM cteEnumerate e

    CROSS APPLY dbo.DelimitedSplit8K(SomePath,'/') split

    WHERE split.Item > ' '

    GROUP BY e.RowNum

    HAVING MAX(CASE WHEN split.ItemNumber = 3 THEN split.Item + '/' ELSE '' END) > ' '

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • vikingDBA

    SSCertifiable

    Points: 5356

    Just did this last week for our developer. Is a table-valued function that returns a table with the values split out. Hope this helps.

    (probably not very scalable, but we don't have that much in the way of transactions per hour).

    First value of the function is the string to split. The second input is the ASCII value of the delimiter (ex: 44 for the comma).

    /* ================================================================================ */

    /* Split out line according to specific value as delimiter */

    /* Table-valued Function */

    /* ================================================================================ */

    /*

    Example: (these lines go together)

    PRINT @@version

    DECLARE @wow varchar(8000)

    SET @wow = ''

    SELECT @wow = @wow + ValLine + '^' FROM dbo.fn_tv_Split_Values(@@version,10) ORDER BY RecID

    SELECT @wow

    ==========================================================================

    --another example

    SELECT ValLine FROM dbo.fn_tv_Split_Values(@@version,10) ORDER BY RecID

    */

    CREATE FUNCTION dbo.fn_tv_Split_Values(@lvaltouse varchar(8000),@valtospliton int)

    RETURNS @table TABLE

    (RecID int IDENTITY PRIMARY KEY NOT NULL,

    ValLine varchar(8000))

    AS

    BEGIN

    DECLARE @lvallength int

    DECLARE @lvalct int

    DECLARE @spotcat int

    DECLARE @lval1 varchar(8000)

    DECLARE @spotcatval int

    SET @lvallength = LEN(@lvaltouse)

    SET @lvalct = 1

    SET @spotcat = 1

    SET @lval1 = ''

    WHILE @spotcat <= @lvallength

    BEGIN

    SET @spotcatval = ASCII(SUBSTRING(@lvaltouse,@spotcat,1))

    if @spotcatval = @valtospliton -- value we are looking for

    BEGIN

    INSERT INTO @table VALUES(@lval1)

    SET @lval1 = ''

    END

    else -- add to current value line

    BEGIN

    --if @spotcatval <> 9 -- other values we are wanting to exclude

    -- BEGIN

    SET @lval1 = @lval1 + CHAR(@spotcatval)

    -- END

    END

    SET @spotcat = @spotcat + 1

    END

    if @lval1 <> '' AND (@lval1 IS NOT NULL)

    INSERT INTO @table VALUES(@lval1)

    RETURN;

    END

  • Sean Lange

    SSC Guru

    Points: 286536

    vikingDBA (4/30/2012)


    Just did this last week for our developer. Is a table-valued function that returns a table with the values split out. Hope this helps.

    (probably not very scalable, but we don't have that much in the way of transactions per hour).

    First value of the function is the string to split. The second input is the ASCII value of the delimiter (ex: 44 for the comma).

    Take a close look at the link in my signature about splitting strings. This is the article Jeff mentioned above. This is a set based splitter and is WAY faster than any loop based splitter. His article explains how it works and how to use it. He also goes into great detail about performance of the different types of splitters.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • MaverickMan

    SSC Eights!

    Points: 925

    Thank you all for providing such wide variety of solutions to this problem. Performance is not the need for our problem so it will work with any of the solutions presented above.

  • Sean Lange

    SSC Guru

    Points: 286536

    MaverickMan (4/30/2012)


    Thank you all for providing such wide variety of solutions to this problem. Performance is not the need for our problem so it will work with any of the solutions presented above.

    I would argue that performance should ALWAYS be a concern. If you use the looping method and performance is acceptable for the data today, in the future you will use the same looping construct on a large dataset and it will take forever.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Lynn Pettis

    SSC Guru

    Points: 442332

    Sean Lange (4/30/2012)


    MaverickMan (4/30/2012)


    Thank you all for providing such wide variety of solutions to this problem. Performance is not the need for our problem so it will work with any of the solutions presented above.

    I would argue that performance should ALWAYS be a concern. If you use the looping method and performance is acceptable for the data today, in the future you will use the same looping construct on a large dataset and it will take forever.

    I have to agree 100%.

  • Jeff Moden

    SSC Guru

    Points: 996622

    MaverickMan (4/30/2012)


    Performance is not the need for our problem so it will work with any of the solutions presented above.

    BWAAA-HAAAA!!!! So! Did you pick the slowest solution? 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Dwain Camps

    SSC Guru

    Points: 86893

    Jeff Moden (5/12/2012)


    MaverickMan (4/30/2012)


    Performance is not the need for our problem so it will work with any of the solutions presented above.

    BWAAA-HAAAA!!!! So! Did you pick the slowest solution? 😀

    If performance is not an issue, maybe he'll pick mine (handles up to 10 levels of directory and splits the filename out to a separate column):

    DECLARE @SomePath TABLE

    (SomePath VARCHAR(512))

    INSERT INTO @SomePath

    SELECT 'Home/Documents/Text/Files/' UNION ALL

    SELECT 'Home/Documents/Text' UNION ALL

    SELECT 'Home/Documents/Text/Files/XYZ.txt' UNION ALL

    SELECT 'Home/Documents/Text/Files/' UNION ALL

    SELECT 'Home/Documents/' UNION ALL

    SELECT 'Batch/Documents/' UNION ALL

    SELECT 'XYZ.txt' UNION ALL

    SELECT 'Batch/Documents/' UNION ALL

    SELECT 'Home/Documents/Text/Files/'

    ;WITH CTE AS (

    SELECT SUBSTRING(SomePath, 1, CHARINDEX('/', SomePath + '/')) As First

    ,SUBSTRING(SomePath, 1 + CHARINDEX('/', SomePath + '/'), LEN(SomePath)) As Rest

    ,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) As n

    ,1 As m

    ,CASE WHEN CHARINDEX('.', SUBSTRING(SomePath, 1, CHARINDEX('/', SomePath + '/'))) <> 0 THEN 1 ELSE 0 END AS fn

    FROM @SomePath

    UNION ALL

    SELECT SUBSTRING(rest, 1, CHARINDEX('/', rest + '/')) As First

    ,SUBSTRING(rest, 1 + CHARINDEX('/', rest + '/'), LEN(rest)) As Rest

    ,n

    ,m + 1

    ,CASE WHEN CHARINDEX('.', SUBSTRING(rest, 1, CHARINDEX('/', rest + '/'))) <> 0 THEN 1 ELSE 0 END AS fn

    FROM CTE WHERE LEN(Rest) > 0

    ),

    CTE2 AS (

    SELECT CASE WHEN SUBSTRING(REVERSE(First),1,1) <> '/' and fn = 0

    THEN First+'/'

    WHEN fn = 1 THEN NULL

    ELSE First END AS Dir

    ,m, n, CASE WHEN fn=1 THEN First ELSE NULL END AS fn2

    FROM CTE

    )

    SELECT MAX([1]) As [1], MAX([2]) AS [2], MAX([3]) AS [3], MAX([4]) AS [4]

    ,MAX([6]) As [6], MAX([7]) AS [8], MAX([9]) AS [9], MAX([10]) AS [10]

    ,MAX(y.fn2) AS [fn]

    FROM CTE2

    PIVOT (MAX(Dir) for m IN ([1], [2], [3], [4], [5], [6], [7], [8], [9], [10])) y

    GROUP BY y.n

    😛


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Jeff Moden

    SSC Guru

    Points: 996622

    Sean Lange (4/30/2012)


    MaverickMan (4/30/2012)


    Thank you all for providing such wide variety of solutions to this problem. Performance is not the need for our problem so it will work with any of the solutions presented above.

    I would argue that performance should ALWAYS be a concern. If you use the looping method and performance is acceptable for the data today, in the future you will use the same looping construct on a large dataset and it will take forever.

    I'm actually amazed that people would even admit to saying something like "performance is not an issue". All else being equal, does anyone actually ever select the slowest one if performance is "not an issue"? 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Dugi

    SSCoach

    Points: 17998

    Jeff, you are amazing how you are playing with T-SQL!

    In every code that you have published for any solution, I have learned something new!

    Thank you sir!

    Dugi

    ============================================================
    SELECT YOUR PROBLEM FROM SSC.com WHERE PROBLEM DESCRIPTION =
    http://www.sqlservercentral.com/articles/Best+Practices/61537/[/url]

  • Jeff Moden

    SSC Guru

    Points: 996622

    Gosh... Thanks Dugi. :blush: Mighty kind of you to say so.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "If "pre-optimization" is the root of all evil, then what does the resulting no optimization lead to?"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply