grain of a table

Question

Post reply

grain of a table

etl2016

Default port

Points: 1459
More actions
November 18, 2016 at 3:50 pm

#331936

Hi,
To me, grain is the mechanism how you can get 1 single Unique row in any table. For example, the grain of a SALES fact table might be stated as "Sales volume by Day by Product by Store". Each record in this fact table is therefore uniquely defined by a day, product and store. Here DATE, PRODUCT and STORE are the Dimension tables.
Q1) Is "grain" the characteristic feature of a FACT table ONLY?
Q2) Given a STAR-schema-DIM or STAR-schema-FACT or ER-Table how do we build a query to identify the Grain of that table?
thanks

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply

pietlinden SSC Guru Points: 63425 More actions · Answer 1

Your question forced me to look up the definition of the grain of a table... good thing to do.

Identifying the level of detail

The level of detail that is available in a star schema is known as the grain. Each fact and dimension table has its own grain or granularity. Each table (either fact or dimension) contains some level of detail that is associated with it. The grain of the dimensional model is the finest level of detail that is implied when the fact and dimension tables are joined. For example, the granularity of a dimensional model that consists of the dimensions Date, Store, and Product is product sold in store by day."

Does this answer your question?

etl2016 Default port Points: 1459 More actions · Answer 2

thanks for the reply.

Actually, I am trying to quantify the feature "grain of a table".

Say, you give "a table" and I should be able to tell its grain, irrespective of whether or not its a FACT or DIM or old school non-Star-schema ER table.

From Kimball's article http://www.kimballgroup.com/2003/03/declaring-the-grain/ .......Perhaps you were expecting the grain to be a traditional declaration of the fact table’s primary key. Although the grain ultimately is equivalent to the primary key.......I think the answer to my question is GRAIN of "a table" = the SQL query that is capable of fetching each row uniquely...

To quantify further, SELECT X, Y, z, count (*) GROUP BY X, Y, Z HAVING COUNT (*) = 1 should give the GRAIN of "a table" irrespective of its a FACT or DIM or ER table.

Is this quantification correct?

etl2016 Default port Points: 1459 More actions · Answer 3

etl2016 (11/18/2016)
To quantify further, SELECT X, Y, z, count (*) GROUP BY X, Y, Z HAVING COUNT (*) = 1 should give the GRAIN of "a table" irrespective of its a FACT or DIM or ER table.
Is this quantification correct?

If HAVING COUNT (*) = 1, then X, Y, Z collectively is the grain of that table.....agree?

Chris Harshman SSC-Forever Points: 42192 More actions · Answer 4

etl2016 (11/18/2016)
etl2016 (11/18/2016)
To quantify further, SELECT X, Y, z, count (*) GROUP BY X, Y, Z HAVING COUNT (*) = 1 should give the GRAIN of "a table" irrespective of its a FACT or DIM or ER table.
Is this quantification correct?
If HAVING COUNT (*) = 1, then X, Y, Z collectively is the grain of that table.....agree?

The grain of a table is better defined by some business terminology than technical SQL terminology. Do you have an existing data warehouse that you inherited without any documentation? If so then some exploration and/or discussion may be needed. The problem with assuming the grain by unique query is that there may be multiple queries that seem to indicate unique combinations of your data, but only one of them was probably intended as the grain.