Basics of Dimensional Modeling

  • Comments posted to this topic are about the item Basics of Dimensional Modeling

  • Great topic, very short article. I think you should follow with more details. For example; try to explain why the year Dimension only has 2 columns.

  • This is a great little article for beginners that have no idea what a dimensional model is. This really helps break it down in a short simple explanation.

  • Good quick read. And I have a question on the specific statement made at the end of the article.

    First, you can run reports directly against the schema since the data is de-normalized and optimized for reporting.

    So, isn't the star formation created for the data actually "normalized"? Not, de-normalized?

    Maybe I'm all confused. Thanks.

  • piotrka (9/9/2014)


    Great topic, very short article. I think you should follow with more details. For example; try to explain why the year Dimension only has 2 columns.

    I agree. Now what?

    I suggest a 'basic' article maybe could also explain why 'dimensional' modeling is different from whatever else you might already know...

    then show us how/why queries against this schema perform better than queries I might have written before I learned this new strategy.

  • qbrt (9/9/2014)


    First, you can run reports directly against the schema since the data is de-normalized and optimized for reporting.

    So, isn't the star formation created for the data actually "normalized"? Not, de-normalized?

    In dimensional modeling, your fact table is highly normalized, and you typically try to represent the smallest level of transactional data (metric) that you can. In this case it is "one record per WEO subject per unit type per country per year" , this is called the grain of the table. The source spreadsheet data will be un-pivoted to store all those columns of years as a separate row each.

    Your dimension tables are typically designed to incorporate columns from multiple tables of your operational system, so that you don't have any descriptive data more than one join away from the fact table. Dimension tables from a source like this are split amongst the related attributes, a date/time scale (in this case we only have year), by geography (by country here), by WEO subject, by units. Typically you'd want all related fact data to be using the same scale, so I probably would have done that before loading the fact data as a cleansing step, avoiding a scale dimension. Having related fact data in the same scale makes it easier to report and do analysis on the data.

  • Chris Harshman (9/9/2014)


    qbrt (9/9/2014)


    First, you can run reports directly against the schema since the data is de-normalized and optimized for reporting.

    So, isn't the star formation created for the data actually "normalized"? Not, de-normalized?

    In dimensional modeling, your fact table is highly normalized, and you typically try to represent the smallest level of transactional data (metric) that you can. In this case it is "one record per WEO subject per unit type per country per year" , this is called the grain of the table. The source spreadsheet data will be un-pivoted to store all those columns of years as a separate row each.

    Your dimension tables are typically designed to incorporate columns from multiple tables of your operational system, so that you don't have any descriptive data more than one join away from the fact table. Dimension tables from a source like this are split amongst the related attributes, a date/time scale (in this case we only have year), by geography (by country here), by WEO subject, by units. Typically you'd want all related fact data to be using the same scale, so I probably would have done that before loading the fact data as a cleansing step, avoiding a scale dimension. Having related fact data in the same scale makes it easier to report and do analysis on the data.

    Ok. Thank you for that explanation. I see were I was confused. The fact table is in "fact" Normalized. From what you're saying,

    Your dimension tables are typically designed to incorporate columns from *multiple* tables...

    the de-normalization can occur in the dimension tables to keep the joins one step away. Although, that de-normalization is not seen in the article.

    Right? :unsure:

  • ... the de-normalization can occur in the dimension tables to keep the joins one step away. Although, that de-normalization is not seen in the article.

    Right? :unsure:

    Correct. If the dimensions were highly normalized, then you'd end up with a snowflake, instead of a star schema.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Alvin Ramard (9/9/2014)


    ... the de-normalization can occur in the dimension tables to keep the joins one step away. Although, that de-normalization is not seen in the article.

    Right? :unsure:

    Correct. If the dimensions were highly normalized, then you'd end up with a snowflake, instead of a star schema.

    LOL...:-D

    Love your refund policy.

    And thx for the reply.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply