SQL for normalizing repeating groups in a text file

  • I would like to know how to create the SQL for normalizing a repeating group text file. An example of the data below.

    ProductNbr, Date, Red, Blue, Green, Yellow

    A102 02-01-2010, 10 15 0 5

    Needed Result of data inserted into a table.

    ProductNbr, Date, Color Units

    A102 02-01-2010 Red 10

    A102 02-01-2010 Blue 15

    A102 02-01-2010 Green 0

    A102 02-01-2010 Red 10

    A102 02-01-2010 Yellow 5

  • Using this approach, you will need to construct a format file based on your text file. This method sets up a cte to read your text file with OPENROWSET BULK and then normalizes the data using CROSS APPLY.

    WITH cte AS

    (

    SELECT * FROM OPENROWSET (BULK 'C:\Sample.txt', FORMATFILE = 'C:\SampleFormat.fmt', FIRSTROW = 2) AS X

    )

    SELECT cte.ProductNbr, cte.Date, Z.Color, Z.Units

    FROM cte

    CROSS APPLY

    (

    SELECT 'Red', Red UNION ALL

    SELECT 'Blue', Blue UNION ALL

    SELECT 'Green', Green UNION ALL

    SELECT 'Yellow', Yellow

    ) AS Z (Color, Units)

  • You could also use UNPIVOT once you have the data in a staging table.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have never heard of or used a UNPIVOT function. Can you please detail how it is used in a SQL statement.

    Thanks

    Mare

  • Thanks for your info...but if I have the data in a table, (which would be the initial process,) would I only have to use the following part of the statement:

    SELECT ProdColortbl.ProductNbr, ProdColortbl.Date, Z.Color, Z.Units

    FROM ProdColortbl

    CROSS APPLY

    (

    SELECT 'Red', Red UNION ALL

    SELECT 'Blue', Blue UNION ALL

    SELECT 'Green', Green UNION ALL

    SELECT 'Yellow', Yellow

    ) AS Z (Color, Units)

    Post #973090

    Post #973090

  • Try it. Does it give you the output you are looking for?

  • I created a table in Access (don't have access to SQL server today) with the repeating fields of colors, then ran sql in my ETL tool and I got this error.

    1. ERROR

    DS-DBMS-E400: UDA driver reported the following on connection 'DS_Sales':

    UDA-SQL-0043 The underlying database detected an error during processing the SQL request.

    [Microsoft][ODBC Microsoft Access Driver] Syntax error in FROM clause.

    SELECT

    Products.ProductNbr, Products.Date, Z.Color, Z.Units

    FROM Products

    CROSS APPLY

    (

    SELECT 'Red', Red UNION ALL

    SELECT 'Blue', Blue UNION ALL

    SELECT 'Green', Green UNION ALL

    SELECT 'Yellow', Yellow

    ) AS Z (Color, Units)

  • marerose888 (8/23/2010)


    I created a table in Access (don't have access to SQL server today) with the repeating fields of colors, then ran sql in my ETL tool and I got this error.

    1. ERROR

    DS-DBMS-E400: UDA driver reported the following on connection 'DS_Sales':

    UDA-SQL-0043 The underlying database detected an error during processing the SQL request.

    [Microsoft][ODBC Microsoft Access Driver] Syntax error in FROM clause.

    SELECT

    Products.ProductNbr, Products.Date, Z.Color, Z.Units

    FROM Products

    CROSS APPLY

    (

    SELECT 'Red', Red UNION ALL

    SELECT 'Blue', Blue UNION ALL

    SELECT 'Green', Green UNION ALL

    SELECT 'Yellow', Yellow

    ) AS Z (Color, Units)

    I'm pretty sure that ACCESS doesn't have a CROSS APPLY in it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If you are restricted to the Access environment, you could use a SQL Pass-Through Query

  • If you're doing this in Access, just use the crosstab query wizard... 🙂

  • Ok tried that...didn't work. you can't flip the repeating group fields in to one field...unless I am missing something. Here is the table data with the repeating groups. You want to pivot the colors and metrics in the "color" field.

    ProductNbrDateColorRedBlueGreenYellow

    1234 2/10/2010152030

    1245 3/15/2010210314

    1645 7/20/20105151418

    1704 6/15/20101215

  • OK so here it is, an ACCESS UNPIVOT query. You will need to create a table called Tally with a single field N, Data Type Number. Simply enter into this Tally table the values 1, 2, 3, 4. Then try the query below.

    SELECT ProductNbr, [Date],

    iif(N = 1, 'Red', iif(N = 2, 'Blue', iif(N = 3, 'Green', iif(N = 4, 'Yellow', '')))) AS Color,

    iif(N = 1, Red, iif(N = 2, Blue, iif(N = 3, Green, iif(N = 4, Yellow, '')))) AS Units

    FROM Products, Tally

    ORDER BY ProductNbr;

  • BTW did you try running the below as a SQL PASS-THROUGH query as I suggested previously? That should work fine too;-)

    SELECT ProdColortbl.ProductNbr, ProdColortbl.Date, Z.Color, Z.Units

    FROM ProdColortbl

    CROSS APPLY

    (

    SELECT 'Red', Red UNION ALL

    SELECT 'Blue', Blue UNION ALL

    SELECT 'Green', Green UNION ALL

    SELECT 'Yellow', Yellow

    ) AS Z (Color, Units)

  • Wow it worked...although I know an initial process would have to be set up first to build the "Tally" table and populate it with all the repeating groups. Now that could be interesting...to do that and not know if the repeating groups are static or not. Anyway, I will try to figure that out....unless of course you know how.

    Anyway, now to further expose my ignorance of this function, how in the second set of iif statements ( iif(N = 1, Red, iif(N = 2, Blue, iif(N = 3, Green, iif(N = 4, Yellow, '')))) AS Units ( does the logic recognize that if the N = 1 and color is Red that the Units grab the units and the column color be populated with the right amount. I will understand if this is hard to describe because I can't see it visually;. But I am determined to learn all I can about this. I will be doing a lot of pivoting ETL for a client that is having to deal with flat files that don't adhere to the 3rd normal form.

    I have don't this will a very sophisticated ETL tool but I want to learn it from the basic code level. But unfortunately I am only able to use MS Access.

    Thanks a bunch.

    Mare

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply