July 9, 2009 at 11:34 pm
Hi Guys,
I have a SSIS Data Flow Task with a Lookup in it. The comparison is between two strings and i vet the data before comparison
by vetting it i mean
1. change it from a DT_WSTR to a DT_STR
2. Run TRIM(Column) or RTRIM(LTRIM(Column)) on it (i have tried both)
the input data is from a dbase using ODBC ADO.NET connector.
Comparison is via a OLEDB MSSQL data source.
Now most of my rows are fine but i have two that slip through the cracks and don't match up.
I can match them manually by copying the data from the source column and doing a lookup in the comparison table and it works but ssis just doesn't want to match.
Is there anything i should know about the Lookup Transform that could be causing this? I tried Fuzzy lookup but i am more looking for an exact match (the data is customer id's) and i also need the non matched data so i can insert it into the comparison table.
Any idea's? I'm pulling my head out trying to fix this.
Thanks
P
July 10, 2009 at 6:51 am
The usual types of errors with this (I can't count how many times I've had to troubleshoot my packages for this type of scenario) are non matching data types, char data types, white space, invisible control characters, case sensitivity, column mappings, wrong servers, wrong tables etc
Check your reference SQL, check your error rows in the dataviewer and compare with the expected rows in the lookup table. Do either of your source or lookup tables have any of the characteristics mentioned above?
HTH
Kindest Regards,
Frank Bazan
July 11, 2009 at 8:31 pm
A couple of things to check. First, if the case of the data is different (upper versus lower or vice versa), you could have some issues. By default the SSIS lookup uses full cache mode, which results in a case-sensitive lookup regardless of the database collation. More information here: http://blogs.conchango.com/jamiethomson/archive/2008/02/12/SSIS_3A00_-Case_2D00_sensitivity-in-Lookup-component.aspx
Second, it may be possible that you have some unprintable characters in your lookup value. If you have some characters such as an ASCII carriage return, newline, etc., in one of the values, they may look identical when you look at them but will likely not match on the lookup. There are some text editors that will show a default character (like a block or a question mark) if it encounters an unprintable byte - you could copy both source and lookup values in there to see if they match.
hth,
Tim
Tim Mitchell
TimMitchell.net | @tmitch.net | Tyleris.com
ETL Best Practices
July 13, 2009 at 2:31 am
Thanks Guys,
1. I have replaced the field in both database ensuring no special characters are inserted and they are exactly the same.
2. Both strings are uppercase.
Any other ideas?
Thanks
P
July 13, 2009 at 3:23 am
Have you been able to isolate the offending records? Is it possible they are of different lengths? Could there be truncation going on?
You are converting to a non-unicode datatype... are they special characters specific to a particular code page?
How have you verified there are no control characters? If you output the offending rows to a file, you should be able to view them in SSIS binary editor. File > Open (In open dialog click open with) A CRLF is encoded as 0D 0A.
Also in SQL you can search using CHARINDEX(CHAR(10),mycolumn) > 0 (replace char(10) with char(13) for line feed).
Basically if they appear to you to be the same ( i.e. Same case and same word), then the only thing left is what you can't see.
HTH
Kindest Regards,
Frank Bazan
July 13, 2009 at 3:41 am
Thanks Frank
1. I sent the output of the query to a file on the source side
2. i checked this source data in a hex editor and there were no extra characters
3. I did a sql query for the field in the comparison db using MS SQL Server Managment Studio
4. I copied the field from MS SQL Server Managment Studio to a txt file opened with notepad (key strokes x + Ctrl c + x so i could have a delimiter)
5. Compared both output's in a hex editor and they are identical.
🙁
July 22, 2009 at 3:24 am
Hi Guys,
I took the wait and see attitude and i found the solution.
It turns out my rows where getting imported without using trim on them and the row was coming from a datasource that used spaces to pad out the field length.
I wrote a stored procedure to clean up the table's row's for that field useing ltrim(rtrim and made sure i used trim on the field before the lookup and it fixed it!.
Must have been a non space char in the field after the final letter.
Thanks for all your help, i don't know why i couldn't see this in the table dump before...
Cheers
P
Viewing 7 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply