Slowly Changing Facts

  • sneumersky (10/18/2013)


    2. Allows for incremental processing of multidimensional Analysis Services cubes instead of an expensive full reprocessing of a cube.

    It is obvious you have been there and come back with knowledge.

    Thank you for sharing!

  • jcraigue@cox.net (10/18/2013)


    A minor point, but not everything is a 'design pattern', nor does it need to be to be useful.

    Just trying to prevent more abuse of the language...

    Design Pattern has to be the new red headed step child of Software Architecture Language. It seems to be abused and used in the wrong way more than "Extreme Application Development".

  • Let's assume we don't use SCD2 on the fact table.

    ID SalesPersonID ProductID SaleMonth Quantity Revenue

    110110012012-0110100

    210110022012-01464

    310110032012-01336

    Then we bring the table to some Reporting tool like BO, OBIEE, Cognos.

    We define the aggregates:

    SUM(Revenue) Revenue_SUM,

    AVG(Revenue) Revenue_AVG

    Now users can design the report by grabbing these elements:

    SalesPersonID, SaleMonth, Revenue_Revenue_SUM, Revenue_AVG

    This will be the average of all sales per Month and Person.

    Reporting tool will generate a query like this:

    SELECT SalesPersonID, ProductID, SaleMonth, SUM(Revenue), AVG(Revenue)

    FROM fact.SalesDetail

    GROUP BY SalesPersonID, ProductID, SaleMonth.

    Statistics: 3 deals, 3 products, total quantity 17

    Revenue_SUM will be (100+64+36) =200

    Revenue_AVG will be (100+64+36)/3 =66.67

    Now, let's assume there was a mistake in the source and it's corrected with SCD2.

    ID SalesPersonID ProductID SaleMonth Quantity Revenue Audit SourceID

    110110012012-012020011

    210110022012-0146412

    310110032012-0133613

    410110012012-01-10-10021

    Statistics: 3 deals, 3 products, total quantity 17

    Revenue_SUM will be (200-100+64+36) =200

    Revenue_AVG will be (200-100+64+36)/4 =50

    It's all the same, but AVG is wrong.

    Well, it's still possible to figure out some of these aggregates.

    For Revenue_AVG would be:

    SUM(Revenue) / (COUNT(*) - COUNT(DISTINCT Audit) +1)

    Would you be able to write one to show the MIN (Revenue) ?

    In this case you need to SUM first to get the full amounts and only then do the MIN.

    This is not doable using Reporting GUI for ordinary users.

    I don't even dare to think of some more complicated analytical functions here like RunningTotals and etc.

    Luckily there are other better solutions.

  • This topic itself is nice and the techniques mentioned in this article can be used in many occasions.

    However "change a dimension ID to see what happens" section does not cover the situation when the staging fact table already has a transaction (record) with same sale person id , product id and date.

    The following update statement will create a record with a duplicate key (100, 1004, 20101102), if allowed.

    UPDATE staging.SalesDetailSource

    SET SalesPersonID = 100

    WHERE ProductID = 1004

    AND SalesPersonID = 101

    AND SaleDate = 20101102;

    Any explanation on how to handle this situation will be interesting.

    Thanks.

  • scrrbiswas (10/28/2013)


    This topic itself is nice and the techniques mentioned in this article can be used in many occasions.

    However "change a dimension ID to see what happens" section does not cover the situation when the staging fact table already has a transaction (record) with same sale person id , product id and date.

    The following update statement will create a record with a duplicate key (100, 1004, 20101102), if allowed.

    UPDATE staging.SalesDetailSource

    SET SalesPersonID = 100

    WHERE ProductID = 1004

    AND SalesPersonID = 101

    AND SaleDate = 20101102;

    Any explanation on how to handle this situation will be interesting.

    Thanks.

    Depends on if you want a Historical record for this dimension table update.

    Is this just a situation where you want to update the Business Key Id in the dim table when it changes in the source?

    Maybe some details on what is a Suragate Key and a Business Key in your statement above?

    You are using Suragate keys in your Dimension tables right?

Viewing 5 posts - 16 through 19 (of 19 total)

You must be logged in to reply to this topic. Login to reply