Parsing tab delimited text into SQL table

  • Hello! I have a weekly tab delimited file that is currently being parsed row by row in .NET code and then inserted into a new table. The table created has 503 columns, with row counts that vary from week to week. The current process takes way too long because each row is parsed one by one.

    I'm looking desperately for a quicker way to accomplish this process.

    Ingredients:

    1. Tab delimited text file

    2. MS SQL Server database

    3. Program is run in .NET

    Has anyone run across a similar situation or can anyone suggest a time efficient way to get this done?

    Thanks in advance for any help!

  • How about creating a batch that executes a BCP (Bulk Copy Program) script?

    _________________________________
    seth delconte
    http://sqlkeys.com

  • OR execute BCP right from a scheduled SQL Agent job. Here's a similar script I've used for a CSV file (you can change it for tab delimited use):

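    -- -c loads character data; its defaults are tab field terminators and newline
    -- row terminators, so -t "," is only needed because this file is comma separated.
    -- For a tab delimited file, drop the -t "," switch.
    declare @cmd varchar(1000)

    set @cmd =
    'bcp AdventureWorks.dbo.mytest in "C:\test.csv" -c -t "," -q -S MYSERVER -T'

    exec xp_cmdshell @cmd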

    _________________________________
    seth delconte
    http://sqlkeys.com

  • If you use SQL Agent, there is no need to introduce an additional programming domain into the call stack. Ditch xp_CmdShell and call bcp from a step with type CmdExec.

    SSIS would be an equally capable tool to use here as well.
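
    A minimal sketch of the CmdExec approach, assuming a hypothetical job name, database, table, and file path (none of these come from the original post). It uses the msdb procedures sp_add_job, sp_add_jobstep, and sp_add_jobserver. Because the file is tab delimited, bcp's -c defaults should be enough and no -t switch is needed.

    ```sql
    -- All names below (job, step, database, table, file, server) are hypothetical.
    EXEC msdb.dbo.sp_add_job
        @job_name = N'Weekly tab file load';

    -- The step runs bcp directly as an operating system command (CmdExec),
    -- so xp_cmdshell is not involved at all.
    EXEC msdb.dbo.sp_add_jobstep
        @job_name  = N'Weekly tab file load',
        @step_name = N'bcp in',
        @subsystem = N'CmdExec',
        @command   = N'bcp MyDatabase.dbo.WeeklyImport in "C:\import\weekly.txt" -c -S MYSERVER -T';

    -- Target the job at the local server so SQL Agent can run it.
    EXEC msdb.dbo.sp_add_jobserver
        @job_name = N'Weekly tab file load';
    ```

    With -T, bcp connects as the account the Agent step runs under, so that account needs access to both the file and the target table. A schedule would still have to be attached to the job, for example with sp_add_jobschedule.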

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (3/13/2012)


    If you use SQL Agent, there is no need to introduce an additional programming domain into the call stack. Ditch xp_CmdShell and call bcp from a step with type CmdExec.

    SSIS would be an equally capable tool to use here as well.

    Nice, good call.

    _________________________________
    seth delconte
    http://sqlkeys.com

  • opc.three (3/13/2012)


    If you use SQL Agent, there is no need to introduce an additional programming domain into the call stack. Ditch xp_CmdShell and call bcp from a step with type CmdExec.

    SSIS would be an equally capable tool to use here as well.

    Ditch SSIS for an equally capable tool for this problem. Just use BULK INSERT and be done with it. 🙂
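
    A minimal sketch of the BULK INSERT route, assuming a hypothetical target table dbo.WeeklyImport that already exists with columns matching the file, and a hypothetical file path. Note that the path is resolved on the SQL Server machine, not on the client that issues the statement.

    ```sql
    -- dbo.WeeklyImport and the file path are hypothetical.
    BULK INSERT dbo.WeeklyImport
    FROM 'C:\import\weekly.txt'
    WITH (
        FIELDTERMINATOR = '\t',  -- tab delimited columns
        ROWTERMINATOR   = '\n',  -- for files with bare line feeds, '0x0a' is commonly used instead
        TABLOCK                  -- optional: take a table lock for the duration of the load
    );
    ```

    The whole file is loaded in one set-based statement, which is the opposite of parsing and inserting each row from application code.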

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (3/13/2012)


    opc.three (3/13/2012)


    If you use SQL Agent, there is no need to introduce an additional programming domain into the call stack. Ditch xp_CmdShell and call bcp from a step with type CmdExec.

    SSIS would be an equally capable tool to use here as well.

    Ditch SSIS for an equally capable tool for this problem. Just use BULK INSERT and be done with it. 🙂

    The "Fast Load Option" of the "OLE DB Destination" in SSIS implements BULK INSERT, same API.


  • opc.three (3/13/2012)


    Jeff Moden (3/13/2012)


    opc.three (3/13/2012)


    If you use SQL Agent, there is no need to introduce an additional programming domain into the call stack. Ditch xp_CmdShell and call bcp from a step with type CmdExec.

    SSIS would be an equally capable tool to use here as well.

    Ditch SSIS for an equally capable tool for this problem. Just use BULK INSERT and be done with it. 🙂

    The "Fast Load Option" of the "OLE DB Destination" in SSIS implements BULK INSERT, same API.

    Of course it is. But you don't need to go anywhere near an SSIS installation with BULK INSERT. It can all be done in T-SQL along with the rest of the processing.
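
    Since the file arrives weekly, its name may well change from run to run. BULK INSERT does not accept a variable in its FROM clause, so a common workaround is to build the statement as dynamic SQL. A sketch, with a hypothetical file name and table name:

    ```sql
    -- The file path and dbo.WeeklyImport are hypothetical.
    DECLARE @file nvarchar(260) = N'C:\import\weekly_20120313.txt';

    -- Build the statement as text; any single quotes in the path are doubled
    -- so the generated string literal stays well formed.
    DECLARE @sql nvarchar(max) =
        N'BULK INSERT dbo.WeeklyImport FROM ''' + REPLACE(@file, N'''', N'''''') + N'''
          WITH (FIELDTERMINATOR = ''\t'', ROWTERMINATOR = ''\n'', TABLOCK);';

    EXEC sys.sp_executesql @sql;

    -- Any further set-based processing of dbo.WeeklyImport can follow here
    -- in the same batch or stored procedure.
    ```

    Wrapping this in a stored procedure that takes the file path as a parameter keeps the load and the follow-up processing in one place.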

    --Jeff Moden



  • Rare is the day when I am in an environment without an SSIS installation available for ETL processing.


  • If you need to do this from a .NET application, load your data from the file into a DataTable and then use the SqlBulkCopy class (http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx) to move the data to your database.

  • Lisa Cherney (3/14/2012)


    If you need to do this from a .NET application, load your data from the file into a DataTable and then use the SqlBulkCopy class (http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx) to move the data to your database.

    Caveat to this approach: make sure the memory on your server is ample. The generic implementation of this technique requires that the entire data file be loaded into memory before it is bulk copied to the server.


  • Ummmm... that class has a batch size parameter to it. I'm not a C# programmer but wouldn't that prevent the whole 9 yards from being loaded at once?

    --Jeff Moden



  • The batch size comes into play when loading the data into the database which is downstream from the issue I am highlighting.

    The SqlBulkCopy.WriteToServer method (4 overloads) is how the class is told to begin loading the data to the database. The most common of the four overloads is the one that accepts an ADO.NET DataTable, the WriteToServer(DataTable) method. That DataTable is by definition a memory-resident object containing the data to load, which in the most generic case is the entire file. That is the most common way to leverage the SqlBulkCopy class and how the overwhelming majority of basic internet tutorials show it.

    The most interesting of the four overloads to me is the one that accepts an IDataReader. The example on MSDN alludes to the original intent of this overload, which was to enable table-to-table data copying by passing in the reader returned by ExecuteReader (a SELECT executed against one database whose results are loaded into another), a very powerful and handy method indeed. With only a few lines of code we can copy all data from all tables in one database to a database with the same schema on another instance (database copy wizard anyone?).

    That said, I would find it much more interesting if one could easily pass an IDataReader hooked up to a flat-file reader. Then you would have something analogous to a subset of the bcp.exe functionality written in .NET. I have not tried it, nor looked to see if anyone has gone there, mainly because we have SSIS, bcp.exe, BULK INSERT and many other tools already capable of doing the task, and I have never been boxed into an environment where I would need such a thing.


  • Thanks, Orlando.

    --Jeff Moden


