• herladygeekedness (7/12/2012)


    akirajt (7/7/2012)


    Should the surrogate key, for instance customerKey for dimension customer, be created in the stage process and then the all data should be loaded into the dimCustomer or should the customerkey be created automatically in the dimCustomer when you are loading the data to the dimCustomer table?

    What is your reflection and experience?

    We use the Kimball Component and it creates our surrogate key and I output that data flow to Staging. If all checks out with our validation, I replace live table with Staging.

    I agree the approach works fine in an environment where the full DIM table gets replaced every time for the Staging one but, in a standard ETL process where DIM table gets incrementally loaded with the daily content of Staging I would create the surrogate key at the time new rows get inserted into DIM table - not at the staging phase, I would keep staging table as close - in terms of schema - as the OLTP source table is.

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.