Slowly Changing dimensions Design and ETL Help

  • I have 3 tables in OLTP like Account,Contact,Address related with foreignkey relationship and account has account type attribute, address has city,state,country,zip

    How I can design a Dimensional data store with this info. The basic business requirement is to find the no of Accounts from a particular zip code , keeping the history of count if the a customer moves from one location to another. I thought account as Fact , Address as slowly changing dimension with type 2, but then if an address changes then I have to add additional row in Account Fact and not a good design.

    What is the logic for relationship based on type 2 SCD.

    Any suggestion will be appreciated.

    Thanks

  • 3ilabsbiz (11/20/2010)


    I have 3 tables in OLTP like Account,Contact,Address related with foreignkey relationship and account has account type attribute, address has city,state,country,zip

    How I can design a Dimensional data store with this info. The basic business requirement is to find the no of Accounts from a particular zip code , keeping the history of count if the a customer moves from one location to another. I thought account as Fact , Address as slowly changing dimension with type 2, but then if an address changes then I have to add additional row in Account Fact and not a good design.

    What is the logic for relationship based on type 2 SCD.

    You do not add an additional row on the FACT table, you add an additional row on the DIM table.

    My favorite design for a Type II SCD DIM table is by adding three columns to the design:

    - Version

    - ValidFromDate

    - ValidUntilDate

    In my design version=0 is the current version while older versions go 1,2,3,etc.

    For rows with version > 0 ValidFromDate/ValidUntilDate tells you "when" that particular row was current.

    Hope this helps.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • That's quite an interesting approach with your "Version" field Paul. I can imagine that could prove very useful.

    You don't mention a "IsRowCurrent" field however, do you derive this at run time when making SCD changes to your Dims?

  • aaa-322853 (12/30/2010)


    That's quite an interesting approach with your "Version" field Paul. I can imagine that could prove very useful.

    You don't mention a "IsRowCurrent" field however, do you derive this at run time when making SCD changes to your Dims?

    In my design "Version=0" means "CurrentRow" - hope this clarifies 🙂

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • Oops my mistake, didn't read your original post correctly!

    Typically when you have a type 2 you would identify this by comparing to you current record, in your case Version = 0, and then expire it before inserting the incoming record. How do you identify the next version number when expiring your outgoing record? Do you do anything clever here?

  • aaa-322853 (12/30/2010)


    Oops my mistake, didn't read your original post correctly!

    Typically when you have a type 2 you would identify this by comparing to you current record, in your case Version = 0, and then expire it before inserting the incoming record. How do you identify the next version number when expiring your outgoing record? Do you do anything clever here?

    😀 -- no problem, the same happens to me all the time.

    The beauty of SCD is that the number of changes per ETL cycle is a very small fraction of ETL's total volume so, solution doesn't have to be fantastic, just functional.

    When a SCD change happens I read version=0; insert it as version=max(version)+1 including ToTimeStamp; then apply change to version=0 including FromTimeStamp.

    Thank you for asking.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • I use a current record column which I call is_current. 1 = TRUE, 0 = FALSE. If I wanted to see the order in which that particular records' SCD2 value changed (for example if the record was a contacts address and I wanted to see the order in which they moved from state to state to state), I could simply sort by Start_Date (effective date).

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply