Load CSV file with columns in different order

  • Hi,
    We have different clients all load data using CSV file, and each client has different header columns need to load them in destination.

    Scenario 1: 
    Client:1 
    Col1,Col2,Col3.....Col50
    1,'abcasda','asjdlkasd',........'aasdasdasd'  
    2,'fassdaf','fsdfsdf',........'asfdasd'  
    3,'asasd','eterte',........'werwerwe'  

    Client:2 
    Col2,Col3,Col1.....Col50
    'abcasda','asjdlkasd',4,........'aasdasdasd'  
    'fassdaf','fsdfsdf',5,........'asfdasd'  
    'asasd','eterte',6,........'werwerwe'  

    End goal is to read those csv file and insert them in destination table.

    Scenario 2: 

    Client2 has extra column , were as i am interested only till Col50

    Client:2 
    Col1,Col2,Col3.....Col50
    1,'abcasda','asjdlkasd',........'aasdasdasd'  
    2,'fassdaf','fsdfsdf',........'asfdasd'  
    3,'asasd','eterte',........'werwerwe'  

    Client:2 
    Col2,Col3,Col1.....Col62
    'abcasda','asjdlkasd',4,........'aasdasdasd'  
    'fassdaf','fsdfsdf',5,........'asfdasd'  
    'asasd','eterte',6,........'werwerwe'  

    How can this be achieved with or without script task. 
    Any suggestions on the above.

  • This may give you an idea to get started if you don’t have too many unique file layouts to deal with.  I had a similar situation where there were three slightly different flat file layouts that needed to be imported into a single table.  The differences in the layouts included different numbers of columns, and in some cases, different headings on columns that had the same meaning.

    There is a flat file connection manager for each of the three layouts, and a fourth flat file connection manager for a “normalized” layout that includes the columns needed to update the table.

    Three corresponding data flows, one for each of the three layouts, reads its input file and outputs the normalized file, mapping or ignoring input columns as appropriate to the input file type.

    The technique depends on being able to determine the input file type during execution, in this case, the file type could be determined by parsing the file name in a script task, which sets a package variable indicating the file type.  Three precedence constraints coming out of the script task use expressions that check the file type variable and direct control to the appropriate data flow.

    After the normalized flat file has been created, it feeds into the package’s main data flow, which imports it into the table.

  • You could create a script source component to handle multiple formats.

    In the PrimeOutput method, open the file and read the first row with the column headers.  Deduce from the headers which format you're dealing with.

    Define multiple blocks of "output column value = input column value" statements, one for each format, in the CreateNewOutputRows method.  Use the format chosen earlier to pick which block to use.

    The output columns of this script source should then have the correct values no matter what format was being read.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply