Importing CSV into a Database

  • Hello all.

    I am attempting to import a text file into a database that I have created.

    I have hundreds of text files - all the same format and eventually want to import them all into the database, but let's not run before I can walk.

    The first thing I am trying is simply to import a test file and already I am failing at that.

    Basically the text file in question is "payment information". At the start of the text file is a "Header" file, which I have removed by ticking the "Column names in the first data row" in my Flat File Connection Manager.

    However at the bottom of each file there is a "Trailer Record", which I don't want to import.

    How would I go about importing all the data in between the Header and the Trailer?

    An example of the Trailer is below -

    "T",2347.94,2347.94,11

    So it always has a T in the first column. Is there something in SSIS I can do to remove it?

    Thanks

  • Off the top of my head, you could import all the rows including the Trailer row (but not the Header row) into a staging table that contains all the columns you need plus an extra column that is simply an auto-incrementing IDENTITY column with a seed value of 1 and an increment of 1 Then use an Execute SQL task to hold the maximum row number in the staging table - the SQL command would be 'SELECT MAX(RowNum( FROM <staging_tablename> (assuming that you've called your identity column RowNum)and the ResultSet property would be set to 'Single row'. You could then store the result set in your variable called, say, MaxRowNumber.

    Next, create a Data Flow task containing an OLEDB Source component and an OLEDB destination compoment. Set the 'Data access mode' to 'SQL Command' and then write your query as:

    SELECT * FROM staging_tablename WHERE RowNum < ?

    Now click on the 'Parameters' button and select the User::@MaxRowNumber variable from the Variable drop-down list.

    Finally, configure the OLEDB Destination component to write rows to your intended target table.

    There may be other ways to accomplish your goal (the easiset of which would be to remove the offending trailer row from the file before it is loaded, but I'm assuming that you can't do that otherwise you'd already have tried! ;-)), but this is a starter for 10.

    Regards

    Lempster

    For some reason the < symbol is not displaying correctly.

  • Thanks for that - but ideally I don't really want to introduce another staging table.

    I'm going to start reading about the "Conditional Split" that you can use in SSIS to see if this will help. I just have no idea how to use it yet.

    Any other suggestions are welcome.

    Thanks again.

  • Yep, good idea, you could use a Conditional Split transform with a condition like 'LEFT(columnname,1) != "T"

    If you've got hundreds of thousands of rows in your text files, it might be a bit slow mind.

  • More stupid questions from me. It takes a lot of effort being this dumb.

    So I have a text file and I have worked out to strip the Trailer at the end of the file by using a conditional split. I've also worked out how to get rid of the "" that were surrounding my text as they were importing in my database as well.

    At present my database only has one filed in it called "RecordIdentifier" it's a Varchar(3) as I try and get my head around trying to fully import he text file I have.

    My Flat File connection manager editor I have renamed "Column 0" and renamed it to "RecordIdentifier" in order to match to the database field of the same name.

    Now it is a DataType of String[DT_STR] and an output column width of 5. (3 + the 2 double " marks)

    What I want to do is how do I create an ID that will be unique and auto create when I import "at present" my one column of data. I have about 80 odd fields in the text file and want each row to have a unique number.

  • Theses are the errors I receive after I run the SSIS after creating a second field in the database which is the primary key.

    The field I have created is an INT called UniqueID. I thought it would automatically create a number? It is not mapped to anything from my txt file.

    [EOD Database [2]] Error: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005.

    An OLE DB record is available. Source: "Microsoft SQL Server Native Client 11.0" Hresult: 0x80004005 Description: "The statement has been terminated.".

    An OLE DB record is available. Source: "Microsoft SQL Server Native Client 11.0" Hresult: 0x80004005 Description: "Cannot insert the value NULL into column 'UniqueID', table 'Datacube_Prog10.dbo.EOD_CAPITA_ENDOFDAY'; column does not allow nulls. INSERT fails.".

    [EOD Database [2]] Error: SSIS Error Code DTS_E_INDUCEDTRANSFORMFAILUREONERROR. The "EOD Database.Inputs[OLE DB Destination Input]" failed because error code 0xC020907B occurred, and the error row disposition on "EOD Database.Inputs[OLE DB Destination Input]" specifies failure on error. An error occurred on the specified object of the specified component. There may be error messages posted before this with more information about the failure.

    [SSIS.Pipeline] Error: SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component "EOD Database" (2) failed with error code 0xC0209029 while processing input "OLE DB Destination Input" (15). The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure.

  • Ryan Keast (1/3/2014)


    Theses are the errors I receive after I run the SSIS after creating a second field in the database which is the primary key.

    The field I have created is an INT called UniqueID. I thought it would automatically create a number? It is not mapped to anything from my txt file.

    You've added a column with a data type of INT, but it won't auto-populate - you can't do that from a flat file. If you create your destination table with an additonal column of type INT with the IDENTITY(1,1) property though, each row that you insert into the table will have a unique integer value courtesy of that column.

    Regards

    Lempster

  • Sorry I don't follow - where do I set the IDENTITY(1,1)???

  • I've just worked it out.

    Thanks for your help. I really appreciate it.

    Did - ALTER TABLE EOD_CAPITA_ENDOFDAY ADD UniqueID INT IDENTITY(1, 1)

  • Ryan Keast (1/3/2014)


    Sorry I don't follow - where do I set the IDENTITY(1,1)???

    Presumably you are importing the data for your text file into a database table, yes? If so, when you create your destination database table, you need to add an extra column in the table definition, e.g.

    CREATE TABLE [dbo].[MyTable] (

    ID int IDENTITY(1,1) NOT NULL,

    FirstColFromTextFile .......,

    SecondColFromTextFile......,

    ....

    ....

    )

    Does that help?

    Regards

    Lempster

  • Right, you got it! Our posts crossed. I didn't know whether your detination table already existed or was being created as part of the Data Flow. Looks like you're on your way now! 🙂

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply