Merging two flat files row by row

  • I have two separate processes that produce two independent CSV files to be imported into a SQL Server table. The first CSV file is a transaction list that contains all but two columns of data for the SQL table. The second CSV file contains the remaining two columns of data and is a one-to-one row match for the first CSV file.

    Right now, I have to open the first CSV file in Excel. Then open the second file, sort it by row id (yeah, it has a row id column; it comes to me out of order, but the rows directly relate to the first CSV file), and copy/paste the two columns I need into the first Excel worksheet. This completed sheet is then imported into Access where a DTS job takes the data and loads it into SQL Server.

    Seems to me I could eliminate a couple of manual steps if I could get SSIS to merge the two CSV files and load the results directly into SQL Server.

    My problem (at least for now) is I can't figure out how to get the two CSV files merged into one, on a row-by-row basis.

    Thoughts?

    Thanks,


    --Mitch

  • for this i would use some staging tables in SQL server, use SSIS to create the tables, load your two sheets into the staging tables, then join using a SQL Join.

    You can then use SSIS to truncate/delete the tables afterwards..

  • There is an SSIS Data Flow Transformation called a Merge Join, which takes as its inputs (sorted) data from two datasets and merges them in the way you want.

    The staging table approach mentioned above will also work fine, of course and may be faster if you have lots of data.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Thanks Phil. I looked into that operation, too. The "catch" is that the one table is already in order that matches the "row number" column of the second table, but itself has no sortable keys that would preserve the order. 🙁

    It's really a right-down-the-file, row by row merge, without any sorting or such, that I'm looking for.

    -- Mitch


    --Mitch

  • I made this work by creating a script transformation component and adding an extra 'RowNum' column as an output.

    Within the script component, after defining the new output column, you just need some very simple code, eg:

    Public Class ScriptMain

    Inherits UserComponent

    Dim Count As Integer = 1

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

    With Row

    .RowNum = Count

    Count = Count + 1

    End With

    End Sub

    End Class

    And then you can use your new RowNum column in the sort and it all works ... maybe 🙂

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply