
Space Sensitivity in SSIS Lookups

By Tim Mitchell, 2009/09/29

The case sensitivity of comparisons in the SSIS lookup transformation has been well documented in myriad blogs and forum posts (a good review can be found here). In a nutshell, a comparison using the lookup transformation is case sensitive when using the default setting of Full Cache, even if the values in the database are stored in a case insensitive collation. This happens because, in Full Cache mode, the comparison is done by the SSIS engine rather than the database engine, and the SSIS engine differentiates between "Value", "VALUE", and "VaLuE" regardless of the collation setting.

But did you know that this transformation is space sensitive as well? Consider the following T-SQL code:

SELECT Cast('char value' AS CHAR(20)) [val1]
INTO   test1

SELECT Cast('char value' AS CHAR(40)) [val2]
INTO   test2

SELECT t1.val1
FROM   test1 t1
       INNER JOIN test2 t2
         ON t1.val1 = t2.val2

As you would expect, executing this code returns a matching row from the INNER JOIN, even though we're comparing CHAR values of differing lengths (see this article for more information about spaces and padding in SQL Server).

However, when the same comparison is run through an SSIS lookup transformation in Full Cache mode, the lookup on our sample data will not result in a match. Similar to the case sensitive lookup, you'll find that the SSIS engine treats the strings 'Hello World' and 'Hello World  ' (note the trailing spaces) as different values. Unlike in SQL Server, trailing whitespace is significant in SSIS value comparisons.
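
To see the padding that causes the mismatch, you can inspect the stored lengths of the sample values. The query below is a minimal sketch that assumes the test1 and test2 tables created earlier; the column aliases are only illustrative.

```sql
-- DATALENGTH counts the stored bytes, including the trailing pad spaces;
-- LEN ignores trailing spaces.
SELECT Datalength(t1.val1) AS [val1_bytes], -- 20
       Datalength(t2.val2) AS [val2_bytes], -- 40
       Len(t1.val1)        AS [val1_len],   -- 10
       Len(t2.val2)        AS [val2_len]    -- 10
FROM   test1 t1
       CROSS JOIN test2 t2
```

SQL Server's equality comparison ignores the difference in padding, while the SSIS engine in Full Cache mode compares the padded strings as they are stored.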

As a workaround, you can use the TRIM() function in SSIS and the RTRIM() T-SQL function to ensure that your comparisons ignore trailing whitespace. Alternatively, you could use a cache mode other than Full Cache, but you should be aware of the other implications before making such a change.
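
As a rough sketch of the RTRIM() approach, again assuming the test1 and test2 sample tables, both sides of the lookup could be trimmed in the queries that feed the package. Where each query is placed in your package is an assumption here; adapt it to your own data flow.

```sql
-- Source side: trim the incoming column before it reaches the lookup.
-- (Inside the data flow, a Derived Column with the SSIS expression
-- TRIM(val1) would serve the same purpose.)
SELECT Rtrim(val1) AS [val1]
FROM   test1

-- Reference side: use a query rather than the table itself as the
-- lookup source, so the cached values carry no trailing spaces.
SELECT Rtrim(val2) AS [val2]
FROM   test2
```

With both sides trimmed, the cached reference value and the incoming value are the same string, so the Full Cache comparison should find the match.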

Note that this behavior is limited to fixed-length character fields, but it can lead to unexpected and hard-to-detect problems if you aren’t aware of it.
