Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

Upsert Dimension Table Expand / Collapse
Author
Message
Posted Tuesday, November 28, 2006 4:16 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Saturday, July 19, 2014 7:45 AM
Points: 6, Visits: 75

SSIS is memory hungry so is you use the LookUp trans will you will need to be mindful of how SSIS will chew all your memory.  I've used the process I outlined above on up to 10 million rows (which is not a lot) but SSIS manages this very well and very fast.

Simply build your package using Execute SQL taks between dataflows to faciliate my process.

Post #325938
Posted Tuesday, November 28, 2006 4:29 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, December 8, 2006 12:13 AM
Points: 8, Visits: 1
Oh cool! I should have thought of that before... thanks a lot :o)
Post #325943
Posted Tuesday, November 28, 2006 4:31 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, December 8, 2006 12:13 AM
Points: 8, Visits: 1
I'm not handling such large data. But will be mindful of that... thanks for the warning.
Post #325944
Posted Tuesday, November 28, 2006 4:33 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Thursday, April 24, 2014 12:00 AM
Points: 30, Visits: 523
Steve,
I think one of your indexes is slightly sick. We already know all about the sale. Now, we need the Upsert article, please.

(We all hope this is NOT a sign of things to come)

Regards,
Peter
Post #325945
Posted Tuesday, November 28, 2006 7:26 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: Administrators
Last Login: Yesterday @ 4:02 PM
Points: 33,062, Visits: 15,176
OK, OK, apologies. That was the weirdest thing. The process we use to load the articles is generating an old one on disk for some reason. Did it again this morning and I didn't catch it last night.

It's up there now.







Follow me on Twitter: @way0utwest

Forum Etiquette: How to post data/code on a forum to get the best help
Post #325997
Posted Tuesday, November 28, 2006 7:59 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Thursday, July 17, 2014 7:55 AM
Points: 2,805, Visits: 3,067

The surrograte key is an identity field, how can it have zero in it?

What I usually do is loading all the data in a staging table.  Then using left join to insert the new records into the dimension table. The dimension table does not need to have the surrogate key.

INSERT INTO Dim(Natural_key, attribute1, attribute2, load_time)
SELECT s.natural_key,  s.attribute1, s.attribute2, GETDATE()
FROM staging_table s
LEFT JOIN dw.dbo.dimension dim ON s.natural_key = dim.natural_key WHERE dim.natural_key is NULL

BTW, your step 2 insert statement, did you forget to insert the natural_key?

If the records are already in the dimension table, then update the record with the most recent information.

Update d set d.attitute1=s.attitute1, d.attitute2 = s.attitute2, d.load_time=GETDATE() FROM dw.dbo.dimension d inner join staging_table s where (d.natural_key = s.natural_key) and (d.attitute1 <> s.attitute1 or d.attitute2 <> s.attitute2)

I don't know what kind of benefit to do SCD3.  I think the effective date and expiration date is good enough in dimension table.  In SCD3 you are changing the dimension table structure.  What if the attribute changes 10 times, you would have a big table.

 

 

Post #326022
Posted Tuesday, November 28, 2006 8:52 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Friday, May 23, 2014 12:09 PM
Points: 155, Visits: 24
In SSIS you could use the <Slowly Changing Dimension> component which basically does the UPSERT.
Post #326070
Posted Tuesday, November 28, 2006 9:27 AM
Say Hey Kid

Say Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey KidSay Hey Kid

Group: General Forum Members
Last Login: Tuesday, October 22, 2013 11:18 AM
Points: 676, Visits: 432

Is there a reason you chose to use a list of "OR" vs CHECKSUM in your update statement? I realize CHECKSUM is not the same as CRC but it works just fine if comparing a single row.

Example:

UPDATE dim
SET dim.attribute1 = tmp.attribute1,
dim.attribute1 = tmp.attribute2,
dim.load_time = tmp.load_time
FROM dw.dbo.dimension dim
INNER JOIN #dim tmp
ON tmp.surrogate_key = dim.surrogate_key
WHERE tmp.surrogate_key <> 0
AND CHECKSUM(dim.attribute1,dim.attribute2) <> CHECKSUM(tmp.attribute1,tmp.attribute2)

This is cleaner to read for SCD Type 1 changes and, at least in our environment, seems to be a bit quicker.




Post #326101
Posted Tuesday, November 28, 2006 10:39 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, December 8, 2006 12:13 AM
Points: 8, Visits: 1

I tried using the Slowly Changing Dimension transormation. I basically need two paths - One to insert the new record ('New Output Path' to OLE DB Destination) and one to update changed attributes in an existing record based on a specified business key ('Changed Attribute Updates Output' to OLE DB Destination).

The insertion works fine but when i modify the same record in source data (retaining the same primary key) the data gets routed in the Changed Attribute path but fails with a primary key violation (it has attempted to insert into the destination table). Am i missing a setting or something that forces update rather than insert?

Post #326281
Posted Wednesday, November 29, 2006 6:10 AM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Thursday, July 17, 2014 7:55 AM
Points: 2,805, Visits: 3,067

cliffb,

I could use checksum, I was just in a hurry so I wrote it in using 'OR' especially there were only two values to check.  However I would not use surrograte key to compare when I update the dimension table, I would use the natural key.

Post #326334
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse