Basics of Dimensional Modeling

Question

Basics of Dimensional Modeling

derekwilson74

SSC Rookie

Points: 49
More actions
September 8, 2014 at 11:20 pm

#295936

Comments posted to this topic are about the item Basics of Dimensional Modeling

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

piotrka SSCrazy Points: 2031 More actions · Answer 1

Great topic, very short article. I think you should follow with more details. For example; try to explain why the year Dimension only has 2 columns.

richard.dangelo Grasshopper Points: 11 More actions · Answer 2

This is a great little article for beginners that have no idea what a dimensional model is. This really helps break it down in a short simple explanation.

qbrt SSCrazy Points: 2737 More actions · Answer 3

Good quick read. And I have a question on the specific statement made at the end of the article.

First, you can run reports directly against the schema since the data is de-normalized and optimized for reporting.

So, isn't the star formation created for the data actually "normalized"? Not, de-normalized?

Maybe I'm all confused. Thanks.

Mike Dougherty-384281 SSCrazy Points: 2794 More actions · Answer 4

piotrka (9/9/2014)
Great topic, very short article. I think you should follow with more details. For example; try to explain why the year Dimension only has 2 columns.

I agree. Now what?

I suggest a 'basic' article maybe could also explain why 'dimensional' modeling is different from whatever else you might already know...

then show us how/why queries against this schema perform better than queries I might have written before I learned this new strategy.

Chris Harshman SSC-Forever Points: 42191 More actions · Answer 5

qbrt (9/9/2014)
First, you can run reports directly against the schema since the data is de-normalized and optimized for reporting.
So, isn't the star formation created for the data actually "normalized"? Not, de-normalized?

In dimensional modeling, your fact table is highly normalized, and you typically try to represent the smallest level of transactional data (metric) that you can. In this case it is "one record per WEO subject per unit type per country per year" , this is called the grain of the table. The source spreadsheet data will be un-pivoted to store all those columns of years as a separate row each.

Your dimension tables are typically designed to incorporate columns from multiple tables of your operational system, so that you don't have any descriptive data more than one join away from the fact table. Dimension tables from a source like this are split amongst the related attributes, a date/time scale (in this case we only have year), by geography (by country here), by WEO subject, by units. Typically you'd want all related fact data to be using the same scale, so I probably would have done that before loading the fact data as a cleansing step, avoiding a scale dimension. Having related fact data in the same scale makes it easier to report and do analysis on the data.

qbrt SSCrazy Points: 2737 More actions · Answer 6

Chris Harshman (9/9/2014)
qbrt (9/9/2014)
First, you can run reports directly against the schema since the data is de-normalized and optimized for reporting.
So, isn't the star formation created for the data actually "normalized"? Not, de-normalized?
In dimensional modeling, your fact table is highly normalized, and you typically try to represent the smallest level of transactional data (metric) that you can. In this case it is "one record per WEO subject per unit type per country per year" , this is called the grain of the table. The source spreadsheet data will be un-pivoted to store all those columns of years as a separate row each.
Your dimension tables are typically designed to incorporate columns from multiple tables of your operational system, so that you don't have any descriptive data more than one join away from the fact table. Dimension tables from a source like this are split amongst the related attributes, a date/time scale (in this case we only have year), by geography (by country here), by WEO subject, by units. Typically you'd want all related fact data to be using the same scale, so I probably would have done that before loading the fact data as a cleansing step, avoiding a scale dimension. Having related fact data in the same scale makes it easier to report and do analysis on the data.

Ok. Thank you for that explanation. I see were I was confused. The fact table is in "fact" Normalized. From what you're saying,

Your dimension tables are typically designed to incorporate columns from *multiple* tables...

the de-normalization can occur in the dimension tables to keep the joins one step away. Although, that de-normalization is not seen in the article.

Right? :unsure:

Alvin Ramard SSC-Forever Points: 41190 More actions · Answer 7

... the de-normalization can occur in the dimension tables to keep the joins one step away. Although, that de-normalization is not seen in the article.
Right? :unsure:

Correct. If the dimensions were highly normalized, then you'd end up with a snowflake, instead of a star schema.

Alvin Ramard
Memphis PASS Chapter[/url]

All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

qbrt SSCrazy Points: 2737 More actions · Answer 8

Alvin Ramard (9/9/2014)
... the de-normalization can occur in the dimension tables to keep the joins one step away. Although, that de-normalization is not seen in the article.
Right? :unsure:
Correct. If the dimensions were highly normalized, then you'd end up with a snowflake, instead of a star schema.

LOL...:-D

Love your refund policy.

And thx for the reply.