• Great comments!

    Rainer: Deleted rows can be correlated with a Right Join instead of a Left Join, and Filtered the same as New Rows. For the Delete action I stage the rows (much like the Updates) and run a set-based query after the Data Flow.

    I do believe loading incrementally is faster than than truncating and reloading in any database that is scaling, or is of sufficient size and experiencing a certain threshold of changes (this is my way of quoting Andy Warren: "It Depends" ;)).

    Surely what you are talking us through here is the same as is accomplished by the Slowly changing dimension 'box' .

    Yussuf: Yep - it's a lot of the same stuff accomplished with the SCD transformation (optimized some). Unless you're familiar with the concepts of ETL for Kimball-based databases, the SCD Wizard can be a bit intimidating. You essentially work through the same thought process you would work through here, but it looks different to the first-time developer. I would say the converse is more accurate: "the SCD is a very cool wrapper for incremental loads."

    ...using a select statement would be faster than using a view.

    Arthur: Gosh I hope I didn't say that. If I did, forgive me. I argue against selecting a Table or View by name from the dropdown in the same way I argue against writing T-SQL with "Select *..." at the beginning. Extra lookups take place for column names and the like when you execute "Select *". The same applies in SSIS which is, after all, executing SQL against the database.

    :{> Andy

    Andy Leonard, Chief Data Engineer, Enterprise Data & Analytics