parse comma-delimited list into seperate columns

  • Not exactly sure how to go about this. I am selecting a few fields from a table. One of the fields has values that are comma-delimited with exactly 6 values. I need to be able to split that into 6 columns.

    Example data:

    ID_1ID_2

    xyz54050, FY11, Apr, Budget, Version_1, 0160117

    abc54050, FY11, May, Budget, Version_1, 0160117

    results would be in the following columns....

    ID_1,Account,Year, Month, Scenario, Version, Entity

  • This thread may be of help.

    http://www.sqlservercentral.com/Forums/Topic830317-338-1.aspx

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • CirquedeSQLeil (1/11/2010)


    This thread may be of help.

    http://www.sqlservercentral.com/Forums/Topic830317-338-1.aspx

    That script returns the values into seperate rows. I need to return into 6 columns

  • Is there any predefined order the data have to show up in the columns?

    Based on your sample data it looks like the following would help (as a second step to continue the link Jason provided):

    replace the SELECT statement with the following CASE statement:

    SELECT

    id,

    max(CASE WHEN idx=0 THEN rght ELSE NULL END) AS Idx0,

    max(CASE WHEN idx=1 THEN rght ELSE NULL END) AS Idx1

    FROM test

    GROUP BY ID

    -- instead of

    select * from test order by id, idx



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Try this.

    -- if temp table already exists (failed previous run), drop it

    if OBJECT_ID('tempdb..#test') IS NOT NULL DROP TABLE #test

    -- simulate the table with the data in it.

    -- NOTE how your sample data was put into a table

    -- to make it easier for us volunteers to work with it!

    DECLARE @test-2 TABLE (RowData varchar(75))

    INSERT INTO @test-2

    SELECT 'xyz 54050, FY11, Apr, Budget, Version_1, 0160117' UNION ALL

    SELECT 'abc 54050, FY11, May, Budget, Version_1, 0160117'

    -- get the data from the table and put it into a temporary work table

    SELECT RowData

    INTO #TEST

    FROM @test-2

    -- add some columns to hold the comma positions

    ALTER TABLE #TEST

    ADD Col1EndPos int,

    Col2EndPos int,

    Col3EndPos int,

    Col4EndPos int,

    Col5EndPos int

    -- need some variables to hold the comma positions for each row

    DECLARE @Col1EndPos int,

    @Col2EndPos int,

    @Col3EndPos int,

    @Col4EndPos int,

    @Col5EndPos int

    -- update the columns to hold the comma positions

    UPDATE #Test

    SET @Col1EndPos = Col1EndPos = CharIndex(',', RowData),

    @Col2EndPos = Col2EndPos = CharIndex(',', RowData, @Col1EndPos + 1),

    @Col3EndPos = Col3EndPos = CharIndex(',', RowData, @Col2EndPos + 1),

    @Col4EndPos = Col4EndPos = CharIndex(',', RowData, @Col3EndPos + 1),

    @Col5EndPos = Col5EndPos = CharIndex(',', RowData, @Col4EndPos + 1)

    -- now, get the data for each column

    SELECT [ID] = ROW_NUMBER() OVER (ORDER BY RowData),

    [Account] = LEFT(RowData, Col1EndPos-1),

    [Year] = SUBSTRING(RowData, Col1EndPos+1, Col2EndPos-Col1EndPos-1),

    [Month] = SUBSTRING(RowData, Col2EndPos+1, Col3EndPos-Col2EndPos-1),

    [Scenario] = SUBSTRING(RowData, Col3EndPos+1, Col4EndPos-Col3EndPos-1),

    [Version] = SUBSTRING(RowData, Col4EndPos+1, Col5EndPos-Col4EndPos-1),

    [Entity] = SUBSTRING(RowData, Col5EndPos+1, len(RowData)-Col5EndPos-1)

    FROM #Test

    -- clean up

    if OBJECT_ID('tempdb..#test') IS NOT NULL DROP TABLE #test

    For a thorough description of how to use this form of the update statement, as well as ALL of the rules for utilizing it, read this article[/url].

    Edit: Replaced variables with column names in the select statement.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • That is a nice script Wayne. I wonder how it scales with thousands of records...

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • Well, the concept comes from Jeff. 'nuf said. 😉

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS (1/11/2010)


    Well, the concept comes from Jeff. 'nuf said. 😉

    Well, the CrossTab solution I used is based on one of Jeff's articles as well...

    Let's see, which one is going to perform better.

    No matter what, Jeff's solution (one or the other) is going to win again (most probably...). 😀



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • WayneS (1/11/2010)


    Well, the concept comes from Jeff. 'nuf said. 😉

    Heh, started testing it after my last post. Still need to look at exec plans. From 2 records growing to 100,000 records it took 6 seconds (without the table variable piece - i dumped the 100,000 records into a table so they would be present prior to execution of the query and not be calculated in the stats and execution plan blah blah blah).

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • lmu92 (1/11/2010)


    WayneS (1/11/2010)


    Well, the concept comes from Jeff. 'nuf said. 😉

    Well, the CrossTab solution I used is based on one of Jeff's articles as well...

    Let's see, which one is going to perform better.

    No matter what, Jeff's solution (one or the other) is going to win again (most probably...). 😀

    Need to add both to my arsenal so they are readily available 😀

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • CirquedeSQLeil (1/11/2010)


    WayneS (1/11/2010)


    Well, the concept comes from Jeff. 'nuf said. 😉

    Heh, started testing it after my last post. Still need to look at exec plans. From 2 records growing to 100,000 records it took 6 seconds (without the table variable piece - i dumped the 100,000 records into a table so they would be present prior to execution of the query and not be calculated in the stats and execution plan blah blah blah).

    I assumed that the table variable piece would be dropped, since the OP did mention that the stuff was in a table already.

    Also, although not given in the data, I assume that the row_number() part will be dropped also... the ID should be in the table also. This will probably speed it up a little bit also. Does your test table utilize an identity column? That would duplicate it pretty well.

    The execution plan should be short-n-sweet. A scan of the temp table for the update, followed by a scan of it for selecting the result set. (And the original query to get the data from the table into the temp table.)

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • lmu92 (1/11/2010)


    Is there any predefined order the data have to show up in the columns?

    Based on your sample data it looks like the following would help (as a second step to continue the link Jason provided):

    replace the SELECT statement with the following CASE statement:

    SELECT

    id,

    max(CASE WHEN idx=0 THEN rght ELSE NULL END) AS Idx0,

    max(CASE WHEN idx=1 THEN rght ELSE NULL END) AS Idx1

    FROM test

    GROUP BY ID

    -- instead of

    select * from test order by id, idx

    update to this one

    When I tested it the results were a bit unexpected. I have updated it to show expected results.

    SELECT

    id,

    max(CASE WHEN idx=0 THEN lft ELSE NULL END) AS Idx0

    ,max(CASE WHEN idx=1 THEN lft ELSE NULL END) AS Idx1

    ,max(CASE WHEN idx=2 THEN lft ELSE NULL END) AS Idx2

    ,max(CASE WHEN idx=3 THEN lft ELSE NULL END) AS Idx3

    ,max(CASE WHEN idx=4 THEN lft ELSE NULL END) AS Idx4

    ,max(CASE WHEN idx=5 THEN lft ELSE NULL END) AS Idx5

    FROM test

    GROUP BY ID

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • lmu92 (1/11/2010)


    WayneS (1/11/2010)


    Well, the concept comes from Jeff. 'nuf said. 😉

    Well, the CrossTab solution I used is based on one of Jeff's articles as well...

    Let's see, which one is going to perform better.

    No matter what, Jeff's solution (one or the other) is going to win again (most probably...). 😀

    I have tested both scripts and compared results.

    I would go with the script provided by Wayne. When the result sets are small, then both are similar. When dealing with 100,000 records or so - the script by Wayne outperforms the first script substantially.

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • thanks everyone. this is really good stuff 🙂

  • Whoops incorrect forum

    If everything seems to be going well, you have obviously overlooked something.

    Ron

    Please help us, help you -before posting a question please read[/url]
    Before posting a performance problem please read[/url]

Viewing 15 posts - 1 through 15 (of 40 total)

You must be logged in to reply to this topic. Login to reply