|
|
|
Old Hand
      
Group: General Forum Members
Last Login: Monday, June 10, 2013 12:27 PM
Points: 375,
Visits: 950
|
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Thursday, April 18, 2013 9:05 AM
Points: 1,
Visits: 52
|
|
Hi Andy, Thanks a lot for detailed discussion.
But my concern in this implementation is related to slowness when the number of records are more. Unnecessary, this package will compare all the record in source table to all the records in destination table to find its state.
My suggestion is to use staging table which will contain only the delta changes in the source table and comparison should be done in delta records.
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: 2 days ago @ 2:46 PM
Points: 2,561,
Visits: 18,910
|
|
Similar question to dheeraj.marwaha, although not the same. Had to do this recently for a process that changed, and after building it using lookups and running into trouble because it wasn't set-based, I'm confused.
In short, if we're going to identify the add/change/delete records then do set-based updates from a staging table, why would we EVER want to use lookups in SSIS? Why not just grab the inbound file and load straight to a staging table, then compare in SQL? No need to write expressions, process the file to identify issues, etc?
--------------------------------------------------------- How best to post your question How to post performance problems Tally Table:What it is and how it replaces a loop
"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Friday, June 14, 2013 9:22 AM
Points: 28,
Visits: 193
|
|
| I do this sort of operation fairly frequently and just looking at the SSIS package for even the simple case in this example sets me on edge. There are plenty of tricks for doing this from a staging table, like comparing checksums rather than individual fields. Also our processes identify that a records needs updating and they report against what was updated on a field by field basis, including previous and new values for just the changed fields. I really don't see how you would do that in SSIS while maintaining any performance.
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Thursday, June 13, 2013 8:15 PM
Points: 162,
Visits: 437
|
|
All of the above comments boil down to: Should we use an ELT or ETL approach?
ETL being what SSIS is designed to do - transform the data in a stream in the ETL tool. ELT being how DTS used to work - load the data into staging tables into the database and do all the work in the database.
Its only worth using a ETL approach if we already know the state of the records in the source. Which in turn requires triggers and logs etc. in the source.
The RBAR update constraint basically means you need to make this a ELT approach. Then you may as well make the entire thing ELT rather than having to contend with two different approaches.
Thats my thoughts.
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Monday, January 09, 2012 3:37 AM
Points: 36,
Visits: 352
|
|
ddriver (10/12/2011) I do this sort of operation fairly frequently and just looking at the SSIS package for even the simple case in this example sets me on edge. There are plenty of tricks for doing this from a staging table, like comparing checksums rather than individual fields. Also our processes identify that a records needs updating and they report against what was updated on a field by field basis, including previous and new values for just the changed fields. I really don't see how you would do that in SSIS while maintaining any performance.
I agree with every you say here, particularly the checksum method. I would add though that for small, and always will be small update sets, the RBAR method in SSIS is useful and quick to implement. I usually use the SSIS RBAR for systme monitoring type process such as sweeping through jobs on all servers looking for failure times or abnormal durations, where change logging is not a requirement.
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Friday, February 15, 2013 11:52 AM
Points: 10,
Visits: 83
|
|
Hi Andy, thanks for the great stairway!
I'm having a problem with the condition in the Conditional Split. I copied it exactly as you have it:
(FirstName != LkUp_FirstName) || ((ISNULL(MiddleName) ? “Humperdinck” : MiddleName) != (ISNULL(LkUp_MiddleName) ? “Humperdinck” : LkUp_MiddleName)) || (LastName != LkUp_LastName)
But I get this error:
Error at Data Flow Task [Conditional Split [104]]: Attempt to parse the expression "(FirstName != LkUp_FirstName) || ((ISNULL(MiddleName) ? “Humperdinck” : MiddleName) != (ISNULL(LkUp_MiddleName) ? “Humperdinck” : LkUp_MiddleName)) || (LastName != LkUp_LastName)" failed. The token " " at line number "1", character number "57" was not recognized. The expression cannot be parsed because it contains invalid elements at the location specified.
Error at Data Flow Task [Conditional Split [104]]: Cannot parse the expression "(FirstName != LkUp_FirstName) || ((ISNULL(MiddleName) ? “Humperdinck” : MiddleName) != (ISNULL(LkUp_MiddleName) ? “Humperdinck” : LkUp_MiddleName)) || (LastName != LkUp_LastName)". The expression was not valid, or there is an out-of-memory error.
Error at Data Flow Task [Conditional Split [104]]: The expression "(FirstName != LkUp_FirstName) || ((ISNULL(MiddleName) ? “Humperdinck” : MiddleName) != (ISNULL(LkUp_MiddleName) ? “Humperdinck” : LkUp_MiddleName)) || (LastName != LkUp_LastName)" on "output "Updated Rows" (120)" is not valid.
Error at Data Flow Task [Conditional Split [104]]: Failed to set property "Expression" on "output "Updated Rows" (120)".
Character number "57" is the space before the first '?'. If I remove the space, it still complains but about the '?' itself. I am running 2008R2.
Any ideas?
|
|
|
|
|
Old Hand
      
Group: General Forum Members
Last Login: Monday, June 10, 2013 12:27 PM
Points: 375,
Visits: 950
|
|
There could be another character hiding out at character 57. My suggestion: Copy the expression to Notepad, make sure it's all on a single line, then copy from Notepad back into the Expression field.
Hope this helps, Andy
Andy Leonard CSO, Linchpin People Follow me on Twitter: @AndyLeonard
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Friday, February 15, 2013 11:52 AM
Points: 10,
Visits: 83
|
|
Andy,
I did that, and it did the trick. Thanks again!
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Friday, June 07, 2013 5:23 AM
Points: 1,
Visits: 21
|
|
I had the same Problem, when copying the expression over. It's the quotation marks.
(FirstName != LkUp_FirstName) || ((ISNULL(MiddleName) ? “Humperdinck” : MiddleName) != (ISNULL(LkUp_MiddleName) ? “Humperdinck“ : LkUp_MiddleName)) || (LastName != LkUp_LastName)
Delete these quotes around Humperdinck and type them back in. This should fix the problem. Another way like already mentioned is copying your expression over to Notepad. This will remove all formatting
|
|
|
|