Today, we continue on with the 31 Days of SSIS blog series. Yesterday’s post was on the use of expressions with Precedence Constraints and how to control the Control Flow. Today, we’ll venture a little further in with the use of expressions, and a head into the Data Flow and how you can use expressions within that.
As you may or may not be aware, most tasks and connections within SSIS can be manipulated to some extent with the use of an Expressions collection. Expressions collections allow for properties, such as Connection Strings and the Disabled property, to be assigned at run-time. The values for these properties can be determined with the SSIS expression language that I previously blogged a bit about.
I mentioned that the Expressions collection was available for tasks and connections. Unfortunately, that I know of, there are no Expressions collections available on any transformations. Personally, I always found this to be a failing, since I often wanted to configure transformations at run-time.
One good example of this is the Lookup transformation. With this transformation, you are able to choose a table, or view as the source. You may also choose to write a query as the source for the lookup.
The two things you cannot do that you will want to do, is either add parameters to the query or use a variable to set the query at run-time. Both of those options are available to OLE DB Sources from SQL Server, but not for the Lookup transformation.
There are a number of situations where I have run across this need. Specifically, when I am using multi-tenant databases the domain of the lookup should be restricted to a single client or customer within the database. To solve this, I will sometimes utilize the Merge Join solution I talked about a while back. There is a different solution that you can go with, that doesn’t require more and different components.
The different solution is to leverage the Expressions collection on the Data Flow task. If you haven’t noticed, when some transformations are added to the Data Flow, there are new properties added to the Data Flow task. These new properties will be exposed in the Expressions collection.
As an example, if you add a Lookup transformation to a Data Flow task, there will be two new properties added to the task. The properties are [SqlCommand] and [SqlCommandParam]. These properties are tied to the query that is added to the Lookup transformation. Modify these parameters, and you will modify the data set that is used by the Lookup transformation.
Rather than talking in platitudes, let’s look at how to set this up in a package. To follow along, download the package at the end of the post. I’m going to assume that you have that package, and will not be talking about setting up the components outside of the Lookup Transformation changes.
Let’s begin. To start with, the SSIS package contains a Lookup transformation named LKUP_Address. Currently, the transformation is using the following query to lookup addresses:
SELECT pa.AddressID ,pa.AddressLine1 ,pa.AddressLine2 ,pa.City ,psp.Name AS StateProvince ,pa.PostalCode FROM Person.Address pa INNER JOIN Person.StateProvince psp ON pa.StateProvinceID = psp.StateProvinceID WHERE psp.Name = 'California'
But when we execute the package, it needs to use the following query:
SELECT pa.AddressID ,pa.AddressLine1 ,pa.AddressLine2 ,pa.City ,psp.Name AS StateProvince ,pa.PostalCode FROM Person.Address pa INNER JOIN Person.StateProvince psp ON pa.StateProvinceID = psp.StateProvinceID WHERE psp.Name = 'Washington'
While this is a contrived example compared to what would be used in a real environment, it does demonstrate how this functionality can be used.
Now that we know what we need to do, let’s move forward with the changes that are required. If you take a look at the Lookup transformation, you’ll see that the transformation is populated with the first query from above. If we run with this query there will be 725 rows sent to the Lookup Match Output.
We need to change this, so that it will execute with the state of Washington instead. As I mentioned above, there are two properties added to the Data Flow task. To get to changes that we require, we need to change the value for the [SqlCommand] property.
On the properties for the Data Flow task, click on the ellipsis at the end of the line for the Expressions property. This will open the Property Expression Editor. Select the property [LKUP_Address].[SqlCommand] from drop down. Then set the expression to:
"SELECT pa.AddressID ,pa.AddressLine1 ,pa.AddressLine2 ,pa.City ,psp.Name AS StateProvince ,pa.PostalCode FROM Person.Address pa INNER JOIN Person.StateProvince psp ON pa.StateProvinceID = psp.StateProvinceID WHERE psp.Name = 'Washington'"
When that is done, execute the package again, and the count for the Lookup Match Output will change from 725 to 379. It probably shouldn’t seem this simple to solve to change the query for the Lookup transformation at run-time. The truth be told, it is.
Using the Expression collection is a great way to extend the flexibility of the Data Flow task directly, and your SSIS package indirectly. Most transformations offer a property or two to the data flow, that can be accessed through the Expression collection. While the example was on the Lookup transformation, the Derived Columns, Conditional Split and others also allow for run-time property changes.
By using these new ways to improve performance, by limiting lookup sets (Lookup transformation) or changing how data flows through the package (Conditional Split transformation) are made available. Take the time to look into these, and make sure to leverage them when applicable.
The funny thing about Expression collections at the Data Flow task level is that up until last year, I hadn’t even been aware of them. The are easy to miss if you haven’t taken the time to do a little exploring. Take a hint from here to look at these and dig a little further, you might find out that something that gets you “every time” is there but just not where you thought it would be.