Importing data from txt file and updating Table

  • Hi all,

    I'm a relative newbie here.

    We receive a text file every month from our supplier which is to be used to update a table we have in SQL server 7 (1.7 million records).

    The txt file contains only additional records and modifications and is circa 500K records.

    What's the best way to do this??

  • Hi djpeake,

    quote:


    We receive a text file every month from our supplier which is to be used to update a table we have in SQL server 7 (1.7 million records).

    The txt file contains only additional records and modifications and is circa 500K records.

    What's the best way to do this??


    only importing 'new' data?

    What about creating a DTS package?

    You can use the DTS Import/Export wizard and in one of the last steps save it as a package.

    Cheers,

    Frank

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • One way is to import the file into a temporary table and then process that temporary table to insert or update records in the master table.

    Depending on how complex the data is, you might get away with a simple insert statement:

    insert <master_table>

    select <col1>, <col2>

    from <temporary_table>

    where not exists (select * from <master_table> where <temporary_table>.<Primary_Key> = <master_table>.<primary_key> )

    ie insert a row where it does not exist already.

    Updates would be update row where exists already.

    If there is more processing to do on each row, such as field validation or inserts into other tables then things get a bit more complicated but the approach is still valid.

    There are several ways of controlling this process. One way is using DTS (personally I have never got to grips with it). Another way is using TSQL (CREATE temp table, BULK INSERT update file into temp table, INSERT new rows, UPDATE existing rows, DROP table).

    Either way, you can put it into a job and then schedule it to run as required.

    Jeremy

  • quote:


    only importing 'new' data?

    What about creating a DTS package?

    You can use the DTS Import/Export wizard and in one of the last steps save it as a package.

    Cheers,

    Frank


    i don't know what you mean by 'new' - do you mean new records?? if so DTS is cool for that, however i've tried it using the file i've got and it doesn't seem to work. I get an error saying that a row would cause a duplicate index value.

  • quote:


    i don't know what you mean by 'new' - do you mean new records??


    sorry, must have read additional data.

    quote:


    if so DTS is cool for that, however i've tried it using the file i've got and it doesn't seem to work. I get an error saying that a row would cause a duplicate index value.


    seems like you're trying to import the PrimaryKey? or something with a unique constraint.

    Try to import everything but that.

    If it is more complex, it would be helpful to post file and/or table structure.

    Cheers,

    Frank

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • A couple of extra points.

    1. Do the updates before the inserts otherwise you end up updating the rows you have just inserted.

    2. If you have 500K rows to process and a table of 1.7M, then you will probably have locking issues. One way round this would be to add an identity column to the load table and then process that in batches of 1000.

    Jeremy

  • quote:


    insert <master_table>

    select <col1>, <col2>

    from <temporary_table>

    where not exists (select * from <master_table> where <temporary_table>.<Primary_Key> = <master_table>.<primary_key> )

    ie insert a row where it does not exist already.

    Updates would be update row where exists already.

    Jeremy


    Insert works great, thx. I know this sounds lame but how would i do the update? (The records & column names are the same in both tables)

    Cheers

    David

  • Something like this (Check it out in BOL):

    update <master_Table>

    set col1 = <temporary_table>.col1, col2 = <temporary_table>.col2 ....

    from <master_table>, <temporary_table>

    where <master_table>.<primary_key> = <temporary_table>.<primary_key>

    I don't think you acutally need the exists statement as the join has the same effect - i.e. selects existing rows.

    Jeremy

  • quote:


    Something like this (Check it out in BOL):

    update <master_Table>

    set col1 = <temporary_table>.col1, col2 = <temporary_table>.col2 ....

    from <master_table>, <temporary_table>

    where <master_table>.<primary_key> = <temporary_table>.<primary_key>

    I don't think you acutally need the exists statement as the join has the same effect - i.e. selects existing rows.


    Cheers

    David

  • Was hoping there was some form of shorthand way of doing it. There are 236 fields!

    Never mind though, many thanks for your help!!!

  • If it is a simple replace all the content of the row with the new content, then you might consider deleting the row and then inserting the row. Performance might take a hit but it would save your fingers.

    Jeremy

  • bit late now, my finger's are sore and i'm ready for home 🙂

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply