James is a big data and data warehousing technology specialist at Microsoft. He is a thought leader in the use and application of Big Data technologies, including MPP solutions involving hybrid technologies of relational data, Hadoop, and private and public cloud. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 30 years of IT experience. James is a popular blogger (JamesSerra.com) and speaker, having presented at dozens of PASS events including the PASS Business Analytics conference and the PASS Summit. He is the author of the book “Reporting with Microsoft SQL Server 2012”. He received a Bachelor of Science degree in Computer Engineering from the University of Nevada-Las Vegas.
A reference dimension occurs when the key column for the dimension is joined indirectly to the fact table through a key in another dimension table. This results in a snowflake schema design.
The following figure shows one fact table named InternetSales, and two dimension tables called Customer (regular or intermediate dimension) and Geography (reference dimension), in a snowflake schema:
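As a rough illustration of that indirect relationship, the sketch below builds the three tables in an in-memory SQLite database and rolls sales up by state. Only the table names and the City, StateProvinceCode, and StateProvinceName columns come from the text above; the key columns (GeographyKey, CustomerKey, SalesKey), the CustomerName and SalesAmount columns, and the sample rows are assumptions made for the example. SQLite stands in for the relational source here and is not the engine the article discusses.

```python
import sqlite3

# In-memory database standing in for the relational source (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Geography (              -- reference dimension
    GeographyKey      INTEGER PRIMARY KEY,
    City              TEXT,
    StateProvinceCode TEXT,
    StateProvinceName TEXT
);
CREATE TABLE Customer (               -- intermediate (regular) dimension
    CustomerKey  INTEGER PRIMARY KEY,
    CustomerName TEXT,
    GeographyKey INTEGER REFERENCES Geography(GeographyKey)
);
CREATE TABLE InternetSales (          -- fact table: no direct key to Geography
    SalesKey    INTEGER PRIMARY KEY,
    CustomerKey INTEGER REFERENCES Customer(CustomerKey),
    SalesAmount REAL
);
INSERT INTO Geography VALUES
    (1, 'Seattle', 'WA', 'Washington'),
    (2, 'Portland', 'OR', 'Oregon');
INSERT INTO Customer VALUES (10, 'Ann', 1), (11, 'Bob', 2), (12, 'Cy', 1);
INSERT INTO InternetSales VALUES
    (100, 10, 50.0), (101, 11, 20.0), (102, 12, 30.0), (103, 10, 5.0);
""")


def sales_by_state(conn):
    """Total sales per state, reaching Geography only through Customer."""
    return conn.execute("""
        SELECT g.StateProvinceName, SUM(s.SalesAmount)
        FROM InternetSales AS s
        JOIN Customer  AS c ON c.CustomerKey  = s.CustomerKey
        JOIN Geography AS g ON g.GeographyKey = c.GeographyKey
        GROUP BY g.StateProvinceName
        ORDER BY g.StateProvinceName
    """).fetchall()


print(sales_by_state(conn))
```

The second join is the hop that makes Geography a reference dimension: the fact table can only reach it by going through Customer.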
Note that for performance reasons, it’s better to avoid reference dimensions. Instead, merge the Geography information into the Customer table (see Denormalizing dimension tables). The Customer table would then gain the fields City, StateProvinceCode, and StateProvinceName, populated by the ETL, so that a single dimension table joins directly to the fact table.
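A minimal sketch of that merge step, again using an in-memory SQLite database as a stand-in for the source system. The CustomerFlat table name, the key columns, CustomerName, SalesAmount, and the sample rows are all hypothetical; a real ETL would more likely populate the existing Customer table than create a new one.

```python
import sqlite3


def build_snowflake(conn):
    """Create the snowflaked source tables (assumed layout and sample rows)."""
    conn.executescript("""
    CREATE TABLE Geography (
        GeographyKey INTEGER PRIMARY KEY,
        City TEXT, StateProvinceCode TEXT, StateProvinceName TEXT
    );
    CREATE TABLE Customer (
        CustomerKey INTEGER PRIMARY KEY,
        CustomerName TEXT,
        GeographyKey INTEGER
    );
    CREATE TABLE InternetSales (
        SalesKey INTEGER PRIMARY KEY,
        CustomerKey INTEGER,
        SalesAmount REAL
    );
    INSERT INTO Geography VALUES
        (1, 'Seattle', 'WA', 'Washington'),
        (2, 'Portland', 'OR', 'Oregon');
    INSERT INTO Customer VALUES (10, 'Ann', 1), (11, 'Bob', 2);
    INSERT INTO InternetSales VALUES
        (100, 10, 50.0), (101, 11, 20.0), (102, 10, 5.0);
    """)


def merge_geography_into_customer(conn):
    """ETL step: copy the Geography attributes onto each customer row.

    LEFT JOIN keeps customers whose GeographyKey has no match.
    """
    conn.executescript("""
    CREATE TABLE CustomerFlat AS
    SELECT c.CustomerKey, c.CustomerName,
           g.City, g.StateProvinceCode, g.StateProvinceName
    FROM Customer AS c
    LEFT JOIN Geography AS g ON g.GeographyKey = c.GeographyKey;
    """)


conn = sqlite3.connect(":memory:")
build_snowflake(conn)
merge_geography_into_customer(conn)

# The denormalized dimension now carries the geography columns itself.
flat_rows = conn.execute("""
    SELECT CustomerKey, City, StateProvinceCode, StateProvinceName
    FROM CustomerFlat ORDER BY CustomerKey
""").fetchall()

# Rolling up by state needs only one join from the fact table.
sales = conn.execute("""
    SELECT c.StateProvinceName, SUM(s.SalesAmount)
    FROM InternetSales AS s
    JOIN CustomerFlat AS c ON c.CustomerKey = s.CustomerKey
    GROUP BY c.StateProvinceName
    ORDER BY c.StateProvinceName
""").fetchall()

print(flat_rows)
print(sales)
```

With the attributes merged, the dimension relates directly to the fact table and the intermediate hop through a second dimension table disappears.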