SSIS merge join vs. insert/delete

  • I have two tables in separate databases on separate servers.  I have to pull data from one table filtered by the other table.  I could use SSIS merge join or I could pull all the data (INSERT within the data flow component) and then filter the data based on the other table (DELETE via the SQL task).  The result is the same.

    I am curious in general in the differences of the two approaches.  I offered no details because I am not concerned about my specific case as it works adequately either way.  I'd just like to understand the differences better.  My main concern is that it feels inefficient to effectively join across servers via the SSIS merge join.

    Thanks.

  • dgl - Tuesday, May 15, 2018 7:18 AM

    I have two tables in separate databases on separate servers.  I have to pull data from one table filtered by the other table.  I could use SSIS merge join or I could pull all the data (INSERT within the data flow component) and then filter the data based on the other table (DELETE via the SQL task).  The result is the same.

    I am curious in general in the differences of the two approaches.  I offered no details because I am not concerned about my specific case as it works adequately either way.  I'd just like to understand the differences better.  My main concern is that it feels inefficient to effectively join across servers via the SSIS merge join.

    Thanks.

    Merge join in SSIS performs badly at scale and is to be avoided if possible.
    The INSERT/DELETE option is also to be avoided, unless the number of rows being deleted is small.
    Instead, I'd move the data to a (truncated and appropriately indexed) staging table on the target server and then use that to do your filtered INSERT.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • "Instead, I'd move the data to a (truncated and appropriately indexed) staging table on the target server and then use that to do your filtered INSERT."

    This is what I was trying to describe as my second option.  That is what I chose to do.  I just wondered if I was missing something positive by not using the merge join.

    Thanks!

  • dgl - Tuesday, May 15, 2018 7:55 AM

    This is what I was trying to describe as my second option. 

    Your second option clearly mentions DELETEs. My suggestion requires none, other than the initial TRUNCATE of the staging table.

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply