foreign key relationship between dimensions and the facts

  • Jayanth_Kurup (2/28/2012)


    There is no substitute for doing proper data quality checks during the ETL. While there is a provision to convert to unknown member during processing of the cube and even the options of defaulting it within the database by assigning a default value as the surrogate key. They are not going to help you clean up after you have inserted the data into the cube.

    Once a fact assigns a unknown member to its key there is no way to differentiate it from previous unknown members , this means that once you do get the correct business key for the dimension you cant go back historically to rectify the issue.

    I might be better to no insert these facts into the table

    It depends. If the dimension's surrogate key takes part in the business key of the fact table then yes, you are absolutely right. If it is not, you can assign it to the unknown bucket without problems.

    If it does take part of the business key, I usually create an inferred member in the dimension, so I can still load my facts without issues.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • A caveat to your above scenario would be a Type 2 SCD which has changed multiple times between the data load.

    Jayanth Kurup[/url]

  • Can you elaborate more?

    It first sight this doesn't give a problem. The facts all have the same business key, so only 1 inferred member is generated.

    When the SCD2 dimension records finally arrive, the first one should update the inferred members, and the subsequent ones create new versions of that member.

    So it is more an issue of correctly loading the SCD2 dimension instead of loading the fact table.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • [So it is more an issue of correctly loading the SCD2 dimension instead of loading the fact table.

    That's what I meant 😀

    Jayanth Kurup[/url]

  • There is no substitute for doing proper data quality checks during the ETL. While there is a provision to convert to unknown member during processing of the cube and even the options of defaulting it within the database by assigning a default value as the surrogate key. They are not going to help you clean up after you have inserted the data into the cube.

    I certainly agree with this. It's not like I have a lot of orders drop into the unknown bucket, but it happens, usually because a value is absent. But some of my dimensions have fixed values. In the scheduled time dimension, for example, I have many combinations of start and end times. If a job comes in with a bizarre start/end, such as 1 am to 3 am, or an end time before a start time, I don't simply add what is clearly an incorrect entry. Instead the value becomes unknown and it's investigated.

  • PaulB-TheOneAndOnly (2/23/2012)


    deepak.a (2/23/2012)


    Is that mandatory to create foreign key relationship between dimensions and the facts in date warehosue ?

    Well... no but, how do you think optimizer learns about relationship between tables? You have to tell and the way to do it is by creating such FKs.

    My understanding is that optimiser uses the relationships created in BIDS DSV regardless of if there is an actual foreign key at the database level.

  • Ferrocarril (2/28/2012)


    PaulB-TheOneAndOnly (2/23/2012)


    deepak.a (2/23/2012)


    Is that mandatory to create foreign key relationship between dimensions and the facts in date warehosue ?

    Well... no but, how do you think optimizer learns about relationship between tables? You have to tell and the way to do it is by creating such FKs.

    My understanding is that optimiser uses the relationships created in BIDS DSV regardless of if there is an actual foreign key at the database level.

    You're talking about the DSV in SSAS, right? It's indeed easier if FK relationships are present, but you can create the relationships in the DSV as well. Or even skip that and create the relationships in the Dimension Usage tab of the cube. But the earlierm the better 🙂

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

Viewing 7 posts - 16 through 21 (of 21 total)

You must be logged in to reply to this topic. Login to reply