Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Surrogat Key be Created in DW or Stage? Expand / Collapse
Author
Message
Posted Saturday, July 7, 2012 3:17 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Monday, June 9, 2014 1:21 AM
Points: 58, Visits: 191
Should the surrogate key, for instance customerKey for dimension customer, be created in the stage process and then the all data should be loaded into the dimCustomer or should the customerkey be created automatically in the dimCustomer when you are loading the data to the dimCustomer table?

What is your reflection and experience?
Post #1326402
Posted Thursday, July 12, 2012 10:08 AM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, June 27, 2014 2:05 PM
Points: 300, Visits: 810
akirajt (7/7/2012)
Should the surrogate key, for instance customerKey for dimension customer, be created in the stage process and then the all data should be loaded into the dimCustomer or should the customerkey be created automatically in the dimCustomer when you are loading the data to the dimCustomer table?

What is your reflection and experience?


We use the Kimball Component and it creates our surrogate key and I output that data flow to Staging. If all checks out with our validation, I replace live table with Staging.
Post #1329004
Posted Friday, July 13, 2012 6:09 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Tuesday, January 28, 2014 8:15 AM
Points: 3,068, Visits: 4,639
herladygeekedness (7/12/2012)
akirajt (7/7/2012)
Should the surrogate key, for instance customerKey for dimension customer, be created in the stage process and then the all data should be loaded into the dimCustomer or should the customerkey be created automatically in the dimCustomer when you are loading the data to the dimCustomer table?

What is your reflection and experience?


We use the Kimball Component and it creates our surrogate key and I output that data flow to Staging. If all checks out with our validation, I replace live table with Staging.


I agree the approach works fine in an environment where the full DIM table gets replaced every time for the Staging one but, in a standard ETL process where DIM table gets incrementally loaded with the daily content of Staging I would create the surrogate key at the time new rows get inserted into DIM table - not at the staging phase, I would keep staging table as close - in terms of schema - as the OLTP source table is.

Hope this helps.


_____________________________________
Pablo (Paul) Berzukov

Author of Understanding Database Administration available at Amazon and other bookstores.

Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
Post #1329395
Posted Friday, July 13, 2012 1:30 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, December 12, 2012 6:28 AM
Points: 8, Visits: 91
I agree with Pablo, but it also depends on how you are handling slowly changing dimensions, if at all.

I personally only use the natural key(s) in the staging and SK's in the final table.

Jake
Post #1329650
Posted Friday, July 13, 2012 2:45 PM
Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Friday, June 27, 2014 2:05 PM
Points: 300, Visits: 810
We do incremental, SCD1, we simply don't insert, we replace existing dimension table with the newly staged table. Partition switches occur very fast, inserts may or may not be swift.

Kimball takes existing dimension and preserves the keys as it determines changes from source system. Kimball outputs show changed, new, deleted, etc (altho we do not delete) and attaches next available key to new records.

It's a very easy way to do things with little downside that I've seen. It's made development very rapid and the table swaps have never failed.
Post #1329666
Posted Friday, July 27, 2012 11:21 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 4:03 AM
Points: 440, Visits: 3,277
akirajt (7/7/2012)
Should the surrogate key, for instance customerKey for dimension customer, be created in the stage process and then the all data should be loaded into the dimCustomer or should the customerkey be created automatically in the dimCustomer when you are loading the data to the dimCustomer table?


Populate surrogate keys in staging. Otherwise you have to do joins back to the warehouse / mart tables while you are populating the referencing tables. Pre-populating the surrogate keys and the references to them minimises the number and size of joins and shortens the transaction during which the DW is being touched.

Avoid using IDENTITY columns in the data warehouse. If you are using SQL Server 2012 then use sequences instead.




David
Post #1336867
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse