Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Tim Mitchell

Tim Mitchell is a business intelligence consultant, author, trainer, and SQL Server MVP with over a decade of experience. Tim is the principal of Tyleris Data Solutions and is a Linchpin People teammate. Tim has spoken at international, regional, and local venues including the SQL PASS Summit, SQLBits, SQL Connections, SQL Saturday events, and various user groups and webcasts. He is a board member at the North Texas SQL Server User Group in the Dallas area. Tim is coauthor of the book SSIS Design Patterns, and is a contributing author on MVP Deep Dives 2. You can visit his website and blog at TimMitchell.net or follow him on Twitter at twitter.com/Tim_Mitchell.

Space Sensitivity in SSIS Lookups

It's been well-documented through myriad blogs and forum posts about the case sensitivity of the comparisons in the SSIS lookup transformation (a good review can be found here). In a nutshell, a comparison using the lookup transformation is case sensitive when using the default setting of Full Cache, even if the values in the database are stored in a case insensitive collation. This happens because, in Full Cache mode, the comparison is done by the SSIS engine rather than the database engine, the former of which differentiates between "Value", "VALUE", and "VaLuE" regardless of the collation setting.

But did you know that this transformation is space sensitive as well? Consider the following T-SQL code:

SELECT Cast('char value' AS CHAR(20)) [val1]
INTO   test1

SELECT Cast('char value' AS CHAR(40)) [val2]
INTO   test2

SELECT t1.val1,
       t2.val2
FROM   test1 t1
       INNER JOIN test2 t2
         ON t1.val1 = t2.val2


As you would expect, executing this code results in a successful match (INNER JOIN), even though we're comparing CHAR values of differing lengths (for more information, see this article for more information about spaces and padding in SQL Server).

However, when the same comparison is run through an SSIS lookup transformation in Full Cache mode, the lookup on our sample data will not result in a match. Similar to the case sensitive lookup, you'll find that the SSIS engine would treat the strings 'Hello World' and 'Hello World  ' (note the trailing spaces) as dissimilar values.  Unlike in SQL Server, trailing whitespace is significant in SSIS value comparisons.

As a workaround, you can use the TRIM() function in SSIS and the RTRIM() T-SQL function to insure that your comparisons are ignorant of trailing whitespace.  Alternatively, you could use a cache mode other than Full Cache, but you should be aware of the other implications before making such a change. 

Note that this behavior is limited to fixed-length character fields, but could lead to some unexpected and hard-to-detect problems if you aren’t aware of the behavior.

Comments

Posted by Steve Jones on 29 September 2009

I did not know that. However that makes more sense to me. If I want matches, then I want to have exact matches. This has been a problem at time in SQL Server for me when it removes spaces that I want considered, or lengths are unknowningly different.

I wish we had an = and a == that would allow us both behaviors in the db engine.

Posted by robp on 30 September 2009

Steve,

Try

SELECT t1.val1,

      t2.val2

FROM   test1 t1

      INNER JOIN test2 t2  

      ON t1.val1 like t2.val2

Leave a Comment

Please register or log in to leave a comment.