SSIS to import and update data if exist on the destination

  • Hi Anshul,

    Yes, the conditional split is updated..please see the output on my attachment

    NewRow ISNULL(DB_HSBC_TRADE_REF)

    ExisitngRow !ISNULL(DB_HSBC_TRADE_REF)

  • So, what is the error you are getting now?

    ____________________________________________________________

    AP
  • It seems it does not know which one is existing and which one is new. it keeps on importing to NewRow which it should not be because those data are already in the destination table.

  • Jim5130 (5/28/2015)


    Hi Anshul,

    Yes, the conditional split is updated..please see the output on my attachment

    NewRow ISNULL(DB_HSBC_TRADE_REF)

    ExisitngRow !ISNULL(DB_HSBC_TRADE_REF)

    Now, the mappings are right.

    But I can see your merge join isn't working. As you're getting 1977 rows from your merger join i.e.1644+333 rather than only 333.

    Your ID column in the flat file isn't matching with the ID column of your Db.

    I would suggest adding a data conversion before sorting and convert ID column from flat file to the same data type and size as of the ID column in your DB. This can be a cause of join not working properly.

    ____________________________________________________________

    AP
  • Can you share your sample package?

    ____________________________________________________________

    AP
  • Hi Anshul,

    Attached. Please rename it to DTSX

  • Jim5130 (5/28/2015)


    Hi Anshul,

    Attached. Please rename it to DTSX

    It seems some issue with the sorting part. Try executing the attached package.

    Also, is it possible to share some sample data as well? So that I can try executing the package as well.

    ____________________________________________________________

    AP
  • Hi Anshul,

    Thanks but still the same. Attached is some dummy file.

    Edited. After testing the dummy file I attached. It imported to the destination DB ( 3 entries). After I re-run again, it does not go to Existing row.. and still goes to newrow.

  • Jim5130 (5/28/2015)


    Thanks but still the same. Attached is some dummy file.

    It is not the same, Merge is working fine now.

    You are using varchar columns as join keys. Add a derived column transformation before the sort and use LTRIM(RTRIM(<<column name>>)) to trim your data.

    There can be an issue the id's are not matching because of some extra spaces.

    Your package is fine now, just tweak a little and it will work fine.

    ____________________________________________________________

    AP
  • My Data type on source table is "char" and on my flat file it is String [DT_STR], would make some issue?

    Any other procedure for Insert/Update? how about the Merge command?

  • Hi,

    I've not noticed anyone else suggest this alternative, so here goes:

    Use cache files to slim down your data.

    Try to avoid putting data into staging tables if at all possible. It is costly in i/o and creates extra steps the data has to pass through. That said in some circumstances it is unavoidable. The T-SQL MERGE is also not that efficient so you need to do this only on the data you need. Ideally avoid it if you can as MERGE has to do a full table scan of the target table during operation.

    Before loading data, create a cache file based upon the natural key of the target table. This is what defines a unique record based upon your business rules. Include the surrogate key also if available. Now within your data flow now refer to this cache file with a Lookup task. Rows which match are updates, rows which don't are inserts. Direct your non-matches (inserts) directly into the target table. Direct your matches into a staging table. Create a stored procedure to batch update your target table from this staging table (I do batches of 1000 rows at a time). And now that you know that all the records in your staging table are updates, you can avoid using the MERGE statement and just use UPDATE.

    This gets you closer to performing as much computation in memory (RAM) as possible, which is much more efficient than storing and calculating with data on disk.

  • Hi Anshul,

    To update, I tried to add a derive column and use LTRIM(RTRIM(HSBC_TRADE_REF)) but it is still the same. The Conditional split is sending the newrow for all data entries from flat file. Any other idea?

    *** Update, I've added the Derived Column on flat file and OLE DB source and it works. I can procedd with the flow now.

    My other question is what if the entry has the same unique entries on 1 column but different on other column, is it possible import it as well?

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply