Multiple dimensions Vs. Single dimension and hierarchy

  • Hi BI community,

    I just have a design question.

    Background: Our main dimension is the article dimension. It is pretty big, because the granularity of the table is one row per article number, size, season and country. Also every article belongs to one and only one style. The articles also have standard attributes like brand, division, etc. and country dependent attributes like landed costs, catalogue price and so on.

    One possible hierarchy would be Style --> Article --> Size

    There are two major report categories: Whole sales and Retail

    Whole sales reports go only to Style level

    Retail reports are more specific and could go to size level by store and day.

    The maintenance of the article table is slow regardless of the loading strategy that is used.

    From de multidimensional design point-of-view I would like to get your suggestions about these two models:

    Option 1:

    Option 2:

    For me one major advantage of the option 2 is the performance of the whole sales report, but in theory build hierarchies (option 1) enrich the multidimensional analysis. The loading process is even more complicated with the option 2.

    The last question is if it worth to keep the country dependent attributes in a separated table. I think this is worst because creates a snow flake schema and also the data model should win against the complexity of the ETL process. In other words, the data model should not be changed just to simplify the ETL load.

    Kind Regards

    Any comment will be highly appreciated.

    Paul Hernández
  • I've had a similar project recently, and in my opinion you should stay with option 1. Here's my reasons why:

    • The style and size attributes are part of the grain (business key) of your articles, and affect other attributes like cost and retail price. Separating these out into separate dimensions will force you to create more snowflakes to accommodate these dependent attributes
    • In my experience, it should still be possible to get to the size attribute for wholesale sales. On my specific project, it required jumping through a few hoops to get that level of detail...but it wasn't impossible. From a logical point of view, your client (and their operational systems) should know which sizes are sent to wholesale customers...even if they aren't interested in analyzing the data at that level

    One exception to your proposed design, is that I would separate country (along with any other geographical attributes like region, etc.) into its own dimension (Geo Dimension). Having one conformed dimension for geography gives you a lot more options in my opinion.

    To associate the geography dimension with other dimensions (like store, customer and article), it is important to ascertain whether the relationship is still required in the absence of a fact record. For instance, you will only have customers in your customer dimension if they have purchased something...and therefore it would not be necessary to have a reference in the customer dimension.

    If the association is required even in the absence of a fact record, I'd recommend the use of a fact-less fact table (or helper table as I'd like to call them). I also sometimes like to include only the business key in the referenced dimension as alternative. It makes the join a little more expensive because you're joining on business keys instead of surrogate keys, but worthwhile if you don't expect to use it very often.

    A good example of this would be employees. For certain types of transactional reporting, you may want to know in which country/state an employee lives (even though they have made no sales). In this situation and depending on your exact requirements, you may choose to include only the zip code in the employee dimension (I'm simplifying for the sake of the example)...which in turn will allow you to join to the geography dimension for that specific reason.

  • I would go with the second option because it allows you more flexibility as you don't depend on traversing the hierarchy using the same route each time. I'm assuming that Style, Article and Size are independent from each other and you can have the same size within different articles and styles. Using a hierarchy, each instance of size will be different and it will create more problems when you try to get a report by size.

    This is unless the 3 dimensions are small enough for a junk dimension to make sense.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis Cazares (8/3/2015)


    I'm assuming that Style, Article and Size are independent from each other and you can have the same size within different articles and styles.

    That's just the thing...these 3 attributes are not independent of each other. From my experience in retail environments, an article is defined by the style, size and the sometimes season and fabric as well (if it is clothing). IMO option 2 would not make sense in that case, because you would still need to include style and size attributes in the article dimension.

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply