Updating tables in SSIS

  • Please help. I'm new to SSIS and need a suggestion on which way to proceed to resolve the following problem. I can lookup the documentation and implement it, just need to be pointed in the right direction within the SSIS framework.

    I'm importing user data from a flat file into a staging table. From the staging table I need to update or insert data in the users table, depending on whether or not the user already exists. At the time I update/insert into the users table, I need to perform some complex business logic on a few of the columns I'm updating, based on values in the users table.

    I'm familiar with sql and stored procs. I just don't understand how this would best be implemented in SSIS. I understand how to INSERT data with SSIS, source --> transformation --> destination, but I need to undertand how one would best UPDATE existing tables while joining to other tables and performing logic against the data based on values in said joined tables.

    Should I cram all this logic in a stored proc and just call it from SSIS, or should I use the script component and attempt to do this through script, or am I missing an obvious tool?

  • There is a SQL command object for doing updates within your data flow.

    One way of doing what you are talking about is to use a lookup component to determine if a record exists, in the lookup failure you will get missing records - put your insert here, and in the lookup success you get existing records - put your update here.

    In addition to that, you can use a merge join to determine if records exist by using a left join in the merge join and then using a conditional split after it looking for NULL's. The "found null" side of the split will be your inserts and the other side will be your updates.

    Finally, there is a "Slowly Changing Dimension" data flow component that will bring up a wizard that basically asks you for the fields you want to join on, which fields to update, and how to update them. When the wizard is done it generates the destination and update components (and some others depending on your selections). These components can then be modified to suit your other needs.

    I would start with trying the SCD wizard component so you can see what it builds. This will give you some sample components for a simple update and give you some ideas on what to do. The SCD component is really slow for large data sets, the lookup component will be fastest with small data sets, and the merge join will be fastest with large data sets (because of some sorting requirements). Finally, you will find that this type of operation is often fastest done in T-SQL.

  • Exactly the type of feedback I needed. Thank you for taking the time and for the thorough response!

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply