|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Friday, April 05, 2013 6:01 AM
Points: 6,
Visits: 43
|
|
SSIS is memory hungry so is you use the LookUp trans will you will need to be mindful of how SSIS will chew all your memory. I've used the process I outlined above on up to 10 million rows (which is not a lot) but SSIS manages this very well and very fast. Simply build your package using Execute SQL taks between dataflows to faciliate my process.
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Friday, December 08, 2006 12:13 AM
Points: 8,
Visits: 1
|
|
| Oh cool! I should have thought of that before... thanks a lot :o)
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Friday, December 08, 2006 12:13 AM
Points: 8,
Visits: 1
|
|
| I'm not handling such large data. But will be mindful of that... thanks for the warning.
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Thursday, May 16, 2013 12:48 AM
Points: 25,
Visits: 505
|
|
Steve, I think one of your indexes is slightly sick. We already know all about the sale. Now, we need the Upsert article, please.
(We all hope this is NOT a sign of things to come)
Regards, Peter
|
|
|
|
|
SSC-Dedicated
           
Group: Administrators
Last Login: Today @ 6:14 PM
Points: 31,421,
Visits: 13,734
|
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Friday, May 17, 2013 7:05 AM
Points: 2,726,
Visits: 2,925
|
|
The surrograte key is an identity field, how can it have zero in it? What I usually do is loading all the data in a staging table. Then using left join to insert the new records into the dimension table. The dimension table does not need to have the surrogate key. INSERT INTO Dim(Natural_key, attribute1, attribute2, load_time) SELECT s.natural_key, s.attribute1, s.attribute2, GETDATE() FROM staging_table s LEFT JOIN dw.dbo.dimension dim ON s.natural_key = dim.natural_key WHERE dim.natural_key is NULL BTW, your step 2 insert statement, did you forget to insert the natural_key? If the records are already in the dimension table, then update the record with the most recent information. Update d set d.attitute1=s.attitute1, d.attitute2 = s.attitute2, d.load_time=GETDATE() FROM dw.dbo.dimension d inner join staging_table s where (d.natural_key = s.natural_key) and (d.attitute1 <> s.attitute1 or d.attitute2 <> s.attitute2) I don't know what kind of benefit to do SCD3. I think the effective date and expiration date is good enough in dimension table. In SCD3 you are changing the dimension table structure. What if the attribute changes 10 times, you would have a big table.
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Tuesday, June 26, 2012 2:39 PM
Points: 155,
Visits: 23
|
|
| In SSIS you could use the <Slowly Changing Dimension> component which basically does the UPSERT.
|
|
|
|
|
Say Hey Kid
      
Group: General Forum Members
Last Login: Friday, May 03, 2013 11:41 AM
Points: 675,
Visits: 426
|
|
Is there a reason you chose to use a list of "OR" vs CHECKSUM in your update statement? I realize CHECKSUM is not the same as CRC but it works just fine if comparing a single row. Example: UPDATE dim SET dim.attribute1 = tmp.attribute1, dim.attribute1 = tmp.attribute2, dim.load_time = tmp.load_time FROM dw.dbo.dimension dim INNER JOIN #dim tmp ON tmp.surrogate_key = dim.surrogate_key WHERE tmp.surrogate_key <> 0 AND CHECKSUM(dim.attribute1,dim.attribute2) <> CHECKSUM(tmp.attribute1,tmp.attribute2) This is cleaner to read for SCD Type 1 changes and, at least in our environment, seems to be a bit quicker.
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Friday, December 08, 2006 12:13 AM
Points: 8,
Visits: 1
|
|
I tried using the Slowly Changing Dimension transormation. I basically need two paths - One to insert the new record ('New Output Path' to OLE DB Destination) and one to update changed attributes in an existing record based on a specified business key ('Changed Attribute Updates Output' to OLE DB Destination). The insertion works fine but when i modify the same record in source data (retaining the same primary key) the data gets routed in the Changed Attribute path but fails with a primary key violation (it has attempted to insert into the destination table). Am i missing a setting or something that forces update rather than insert?
|
|
|
|
|
SSCrazy
      
Group: General Forum Members
Last Login: Friday, May 17, 2013 7:05 AM
Points: 2,726,
Visits: 2,925
|
|
cliffb, I could use checksum, I was just in a hurry so I wrote it in using 'OR' especially there were only two values to check. However I would not use surrograte key to compare when I update the dimension table, I would use the natural key.
|
|
|
|