Partially vs Fully denormalized dimension and different facts granularities

  • Hi everyone,

    It was not easy to find the correct title for this topic, so let me try to explain the situation:

    Background:

    We have to fact tables: Retail Sales and Finance. The retail sales granularity includes the product level. The financial figures are not related to the products but to the business units.

    Each product belongs to a one and only one business unit. The relationships could be represented as follows:

    First approach:

    Our first approach is to have only one denormalized product dimension, in this case we want to have the business unit description in the product table. The drawback is we cannot join this product dimension table with the financial figures anymore.

    http://i623.photobucket.com/albums/tt313/pauldj54/Blog/joining%20facts%20and%20dim%202.png

    For this reason the reference to the business unit is kept in the table. As a result we have a partially normalized dimension table with around 30 normalized attributes and 3 foreign keys to another tables similar to the business unit case.

    The question:

    Is it a best practice? Is there another option?

    We also have some performance issues with our specific OLAP tool during the cube processing, if these relationships are resolve during processing time.

    Any advice would be highly appreciated.

    Kind Regards,

    Paul

    Paul Hernández
  • I would personally recommend the following:

    * Keep separate Business Unit and Product dimensions. You really don't have any other choice because of your fact tables.

    * Either have the Business Unit details (all of it) in the Product Dimension as well, or at least the natural key. With the natural key you will be able to join to the BU dimension if required. I'd opt to have all BU attributes in the Product dimension (to avoid snow-flaking), unless there is a good reason not to.

    * In your Retail Sales fact, add and populate the key of your BU dimension during ETL.

    This scenario happens frequently with geographical attributes as well. At first, you may want to keep all geo attributes in their respective dimensions (Employee, Customer, etc.)...but at some point you usually want users to be able to select sales for a specific geographical attribute without the constraints of going through the customer for instance.

    For that reason it probably makes sense to have a separate dimension (BU in your case), but still retain the attributes in their respective dimensions in case you want to build hierarchies (and of course for other fact tables that need the separate dimension).

  • Hi Martin,

    thanks so much for your useful answer. I just want to give you an example to see if I have understood correctly:

    Use Scenario:

    I decide to have a separated Business Unit dimension, I added the relevant attributes from this dimension also in the Products table. I added a populated the transactions with the key to both, products and bu dimensions.

    I am a business analyst and I want to analyze the retail sales by store and business units using Excel and power pivot connected to a OLAP SSAS Cube. I have two possible ways to include the business units due to my redundant design, I can drag and drop it from both, product and bu dimension, and I must get the same results.

    If I want to do the same but also including the financial figures, only the attributes coming from the bu dimension will break down the figures.

    Is this correct?

    I think is ok but the analyst must be aware of that in order to avoid possible inconsistencies.

    Paul Hernández
  • Yes, that is correct. A bit of a training issue, but that is not unusual.

    It would also be possible to use Scope Assignments to "lookup" the measures through the BU dimension, in the case that an end user wants to use the product dimension but look at financial numbers.

    This will however not change the back-end design we have just spoken about. I can definitely recommend that you read Chris Webb's blog post on the subject: https://cwebbbi.wordpress.com/2013/05/29/aggregating-the-result-of-an-mdx-calculation-using-scoped-assignments/

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply