There are a few times where you might want to perform a cross join/cartesian join/cartesian product operation in SSIS. One of those scenarios could be performing some kind of a double-sided "range join" where you can't line up a specific key in your data flow to a specific key in your lookup table. Quite often you're planning to have a Conditional Split right after your cross join operation to heuristically eliminate the rows that don't qualify for your "range" matching.
Warning - performing a cross join is an inherently expensive endeavour! It's called a cartesian product for a reason - you're multiplying rows. Every row from one source will be replicated for each and every row from the other source. If you have 100 rows from each source, your cartesian product will result in 100x100 = 10,000 rows. This can quickly get to be a rediculously large number. A source of 200,000 rows joined to a 3,000 row table - quite a reasonably small "range lookup" scenario - will result in 600,000,000 rows. Don't say you haven't been warned about performance - in terms of processing power AND memory.
The obvious component you'll try to get this functionality from is the Merge Join - but it only supports Full Outer joins, Left Outer joins, and Inner joins. There are no other components that offer join capability in SQL Server Integration Services, so how can you perform this kind of join?
Cross Join Pattern #1
I've chosen to (quite uselessly) cross join SQL Server's sysobjects and sysindexes tables.
Step 1 - Add an Artificial Key Column
Step 3 - Perform a Full Outer Join
As you can see, my queries of sysobjects and sysindexes have been cross joined to each other. Individually, they have very few rows each, but the cross join makes that row count explode.
Cross Join Pattern #2
This pattern performs a little better than #1 - but it has more restrictions on it. This technique can't be used anywhere in the data flow - it has to be used at the top. It can also only be used on sources where we can ask for artificially generated columns - so it's not going to work on text files.
We still have to do this - but we do it a different way this time. First, we have to add the join_key to the source query. We don't have to issue an ORDER BY operation in the query (not that SSIS would pay attention to it), because the join_key is a constant.Step 3 - Perform a Full Outer Join






