TSQL: How to transform first row as column name?

  • Lidou123

    Hall of Fame

    Points: 3052

    Hello,

    I hope you are all well.

    I would like your help on a data transformation task that I have.

    I would like to convert the first row of a table to a column name

    I am working on SQL Server Azure and I get daily data from another service.This service loads a table that is of the same form: TableA

    I would like to transform the data in the same manner:

    TableB

    Do You have any idea how to do it ?

    Thank you for your help

     

     

  • Phil Parkin

    SSC Guru

    Points: 244578

    Do the column names change from day to day?

    If they do, do you want to create a new table every day?

    Or is the target table static in structure such that you just want to ignore row 1 and split rows 2 and beyond before loading them?

    If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
    See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

  • Thom A

    SSC Guru

    Points: 98644

    Your data looks like a odd mix of an excel file and a CSV. If you were sticking to one technology this would actually be easier. Phil raises some good questions here though. I'd hate for this to have to go down the dynamic-SQL route.

    Also, cross post on Stack Overflow: https://stackoverflow.com/questions/55958126/how-to-transform-first-row-as-column-name

     

     

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 719742

    Also, how do you get this data?

  • Lidou123

    Hall of Fame

    Points: 3052

    Hello Phil

    Thanks for your answers.

    - We have 50 tables that could be refreshed. Depending of the day, we receive data for a specific destination table.

    So column name could change from day to day.

    • This reply was modified 1 year, 2 months ago by  Lidou123.
  • Lidou123

    Hall of Fame

    Points: 3052

    Hello,

    Data are in a sql Server table.

    I used excel just for illustrate my issue.

  • Lidou123

    Hall of Fame

    Points: 3052

    Data are loaded every night by an ETL from another service.

  • Thom A

    SSC Guru

    Points: 98644

    Lidou123 wrote:

    Hello, Data are in a sql Server table. I used excel just for illustrate my issue.

    So the data is stored in a delimited format in your Table? This sounds like the problem is your ETL process and you need to fix that. The ETL process should be creating the appropriate table, with the right column names, and storing the right values in the right column. Not storing colon delimited data in the table without the correct column headings.

    You need to share the details of your ETL package and then we can explain how to fix it.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.

  • Phil Parkin

    SSC Guru

    Points: 244578

    Lidou123 wrote:

    Hello Phil Thanks for your answers. - We have 50 tables that could be refreshed. Depending of the day, we receive data for a specific destination table. So column name could change from day to day.

    So am I right in thinking that you determine which table is the target based on the column names? That is, you have some sort of mapping rules?

    If the answer to your question can be found with a brief Google search, please perform the search yourself, rather than expecting one of the SSC members to do it for you.
    See https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help/ for details of how to post T-SQL code-related questions.

  • Steve Jones - SSC Editor

    SSC Guru

    Points: 719742

    Also, does the row number get populated? Without that, you can't tell where the columns are. Rows have no natural order.

    This can probably be done in T-SQL, though how do you know what the target table is? It would be helpful if you mocked your table in SQL Server with some data, then show how you want results, with table and column names.

  • Lidou123

    Hall of Fame

    Points: 3052

    Hello Phil.

    Thank you for your answers.

    I will be able to clarify my request.

    The rules and process for the loading datadata like that:

    - we receive an email from another service that warns us that they will refresh the data from table A.

    Table A has 3 columns.

    Column1 -> Row Num

    Column2-> Destination Table (Table B)

    Column3 -> data (with first is column name)

    The first row is the name of column and then we have data associated to these columns.

    The first row is like the mapping schema for my destination table.

    Separator field is always the same (semicolumn).

    - Then I need to load data from Table A to table B. Table B is already created and is static ( columns don't change).

    I am trying to develop a Dynamic query with a parameter table for the insert script. But I don't know if It's the best solution.

    Thank you for your help.

     

     

     

  • Thom A

    SSC Guru

    Points: 98644

    So, assuming I understand correctly, your data looks like this:

    CREATE TABLE dbo.ETLData (ROW_NUM int,
    Line varchar(255)); --Guessed data type

    INSERT INTO dbo.ETLData (ROW_NUM,
    Line)
    VALUES(1,'ColumnAName;ColumnBName'),
    (2,'ValueA1;ValueB1'),
    (3,'ValueA2;ValueB2'),
    (4,'ValueA3;ValueB3'),
    (5,'ValueA4;ValueB4');

    Couple of methods here involve Dynamic SQL. The first uses PARSENAME, assumes that each column will have a string length of 128 characters or less, and that there are only 2 (although you can expand this method up to 4). It will not scale automatically and means your data can't include a period(.) either:

    DECLARE @SQL nvarchar(MAX);

    SELECT @SQL = N'SELECT ROW_NUMBER() OVER (ORDER BY ED.ROW_NUM) AS RN,' + NCHAR(13) + NCHAR(10) +
    N' D.PN2 AS ' + QUOTENAME(C.PN2) + N',' + NCHAR(13) + NCHAR(10) +
    N' D.PN1 AS ' + QUOTENAME(C.PN1) + N'' + NCHAR(13) + NCHAR(10) +
    N'FROM ETLData ED' + NCHAR(13) + NCHAR(10) +
    N' CROSS APPLY(VALUES(PARSENAME(REPLACE(Line,'';'',''.''),1),PARSENAME(REPLACE(Line,'';'',''.''),2))) D(PN1,PN2)' + NCHAR(13) + NCHAR(10) +
    N'WHERE ED.ROW_NUM > 1;'
    FROM dbo.ETLData ED
    CROSS APPLY(VALUES(PARSENAME(REPLACE(Line,';','.'),1),PARSENAME(REPLACE(Line,';','.'),2))) C(PN1,PN2)
    WHERE ED.ROW_NUM =1 ;

    PRINT @SQL;

    EXEC sp_executesql @SQL;

    If you need a solution that automatically handles varying data (different column amounts) or more than 4, you need to be a bit more "clever". This uses Jeff Moden and Eirikur Eiriksson's work on delimitedsplit8k_lead. I'm not using STRING_SPLIT, as it doesn't provide ordinal position, and so can't be relied on. Note that if Line can have more than 8,000 characters (or 4,000 if you need to use N4K version) then this will truncate the value. Hopefully that isn't the case (if it it, I refer to my closing comment).

    DECLARE @SQL nvarchar(MAX);

    SELECT @SQL = N'SELECT ROW_NUMBER() OVER (ORDER BY ED.Line) AS RN,' + NCHAR(13) + NCHAR(10) +
    STUFF((SELECT ',' + NCHAR(13) + NCHAR(10) +
    N' MAX(CASE WHEN DS.ItemNumber = ' + CONVERT(varchar(4),DS.ItemNumber) + N' THEN DS.Item END) AS ' + QUOTENAME(DS.Item)
    FROM dbo.ETLData ED
    CROSS APPLY dbo.DelimitedSplit8K_lead(ED.Line,';') DS
    WHERE ED.ROW_NUM = 1
    ORDER BY DS.ItemNumber
    FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,3,N'') + NCHAR(13) + NCHAR(10) +
    N'FROM ETLData ED' + NCHAR(13) + NCHAR(10) +
    N' CROSS APPLY dbo.DelimitedSplit8K_lead(ED.line,'';'') DS' + NCHAR(13) + NCHAR(10) +
    N'WHERE ED.ROW_NUM > 1' + NCHAR(13) + NCHAR(10) +
    N'GROUP BY ED.Line' + NCHAR(13) + NCHAR(10) +
    N'ORDER BY ED.Line;';

    PRINT @SQL;
    EXEC sp_executesql @SQL;

    If you need to, you can make either of these dynamic statements into an INSERT statement; the PRINT statement will help you debug the queries.

    Honestly, I still think the better place to do this is change the ETL process to properly import the data though, and not importing delimited data. Unfortunately without knowing the process you're using, it'll be very hard to tell you how to fix it.

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.

  • Lidou123

    Hall of Fame

    Points: 3052

    Thank you Thom A for your answer

    I will test it in this morning

     

  • Lidou123

    Hall of Fame

    Points: 3052

    Hello Steve,

     

    This is a screenshot of my Source Table.

    TableSource

     

    And this a Screenshot of Table(Ods.Products)  that I want to populate.

    TableDestination

    Thank you for your help

     

    • This reply was modified 1 year, 2 months ago by  Lidou123.
  • Thom A

    SSC Guru

    Points: 98644

    Ahh, if that's the format and requirement, then this is a more "complete" solution (assumes that the ID column will always be called Id) that inserts the data into the appropriate table defined by the column sas_table:

    CREATE TABLE dbo.ETLData (ROW_NUM int,
    Sas_table sysname,
    Line varchar(255)); --Guessed data type

    INSERT INTO dbo.ETLData (ROW_NUM,
    Sas_table,
    Line)
    VALUES(1,'dbo.SampleTable','ColumnAName;ColumnBName'),
    (2,'dbo.SampleTable','ValueA1;ValueB1'),
    (3,'dbo.SampleTable','ValueA2;ValueB2'),
    (4,'dbo.SampleTable','ValueA3;ValueB3'),
    (5,'dbo.SampleTable','ValueA4;ValueB4');

    CREATE TABLE dbo.SampleTable (ID int,
    ColumnAName varchar(50),
    ColumnBName varchar(50));
    GO

    DECLARE @SQL nvarchar(MAX);



    SELECT @SQL = N'INSERT INTO ' + CONCAT_WS('.',QUOTENAME(PARSENAME(ED.Sas_table,4)),QUOTENAME(PARSENAME(ED.Sas_table,3)),QUOTENAME(PARSENAME(ED.Sas_table,2)),QUOTENAME(PARSENAME(ED.Sas_table,1))) + NCHAR(13) + NCHAR(10) +
    N' ([Id],' + STUFF((SELECT ',' + QUOTENAME(DS.Item)
    FROM dbo.DelimitedSplit8K_LEAD(ED.Line,';') DS
    WHERE ED.ROW_NUM = 1
    ORDER BY DS.ItemNumber
    FOR XML PATH(''),TYPE).value('.','nvarchar(MAX)'),1,1,N'') + ')' + NCHAR(13) + NCHAR(10) +
    N'SELECT ROW_NUMBER() OVER (ORDER BY ED.Line) AS RN,' + NCHAR(13) + NCHAR(10) +
    STUFF((SELECT ',' + NCHAR(13) + NCHAR(10) +
    N' MAX(CASE WHEN DS.ItemNumber = ' + CONVERT(varchar(4),DS.ItemNumber) + N' THEN DS.Item END) AS ' + QUOTENAME(DS.Item)
    FROM dbo.ETLData ED
    CROSS APPLY dbo.DelimitedSplit8K_LEAD(ED.Line,';') DS
    WHERE ED.ROW_NUM = 1
    ORDER BY DS.ItemNumber
    FOR XML PATH(N''),TYPE).value('.','nvarchar(MAX)'),1,3,N'') + NCHAR(13) + NCHAR(10) +
    N'FROM ETLData ED' + NCHAR(13) + NCHAR(10) +
    N' CROSS APPLY dbo.DelimitedSplit8K_LEAD(ED.line,'';'') DS' + NCHAR(13) + NCHAR(10) +
    N'WHERE ED.ROW_NUM > 1' + NCHAR(13) + NCHAR(10) +
    N'GROUP BY ED.Line' + NCHAR(13) + NCHAR(10) +
    N'ORDER BY ED.Line;'
    FROM dbo.ETLData ED
    WHERE ED.ROW_NUM = 1;

    PRINT @SQL;
    EXEC sp_executesql @SQL;
    GO

    SELECT *
    FROM dbo.SampleTable;

    GO
    DROP TABLE dbo.ETLData;
    DROP TABLE dbo.SampleTable;

    Considering you're using SQL Server 2017, you could use STRING_AGG, but I don't get to use my 2017 Sandbox often, and not everyone does have access to the function; hence I tend to stick to FOR XML PATH and STUFF. This would more succinct using STRING_AGG and you might as well use the functionality you've got considering your on  the latest version (I did remember to use CONCAT_WS at least, which was a nice addition in 2017).

    • This reply was modified 1 year, 2 months ago by  Thom A. Reason: Cause if my indents aren't right it annoys me. :)

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply