Exam Prep 70-463: Tables and Schemas

Colleen M. Morrow, 2014-11-25 (first published: 2014-11-17)

Well, the PASS Summit has come and gone and it was *awesome*. What a difference from my first Summit experience, when I went to every session I could and then went back to the hotel. This time around there was something going on every evening and I made the most of it. I think the fact that I slept for 14 hours when I got home on Saturday speaks volumes.

Besides sessions and networking, I also managed to pass 2 of my certification exams last week. That leaves the much-dreaded 70-463 exam. Dun-Dun-Duuunnnn! So it’s time to start preparing for that exam, and as I go along I’ll post here so you can study along with me. Ready?

Exam 70-463 focuses on implementing a data warehouse in SQL Server. Obviously this is a pretty big topic that goes way beyond even what the exam covers, but let’s set the foundation with some fundamentals. Most of us are very familiar with OLTP database design. It’s highly normalized and great for line of business (LOB) applications that are primarily using lots of small transactions of very short duration. The problem with a normalized design is that it’s not so great for analytical reporting. First of all, normalization means joining many tables which leads to poorer query performance and also makes for much more complex queries. The other problem is that all of your LOB data reside in different databases, so comparing or aggregating data between different systems is difficult. To get around this, we build a data warehouse where all of our disparate OLTP data can be merged for analysis and historical reporting. What makes the data warehouse so great for this type of reporting is its fundamental design, and that’s what we’re going to cover in this post.

Tables

There are two main types of tables in the data warehouse environment: fact tables and dimension tables.

Fact – Fact tables store quantitative information for analysis or reporting. An example of this would be a sales table like the FactInternetSales table in the AdventureWorksDW2012 database. Every row in the FactInternetSales table records a sale that happened. Customer A purchased Product B on date C, and so on. However, your typical sales fact table wouldn’t contain the actual customer information. Take a look at FactInternetSales and you won’t see customer data. Rather it stores a foreign key reference to the customer data. And that’s where dimension tables come in.
Dimension – A dimension table contains descriptive attributes that can be used to filter, group, and label facts in the fact table. They provide context. For example, the FactInternetSales table mentioned earlier would likely be joined to a Customers dimension table, as well as a Products dimension. The more dimensions a fact table is associated with, the more ways you can slice the data. This is referred to as dimensionality or granularity.

Schemas

It’s the logical design or schema of the data warehouse, or how those fact and dimension tables are laid out and connected to each other, that makes the data warehouse so great for reporting. And your data warehouse will typically use one of two schemas: the Star schema and the Snowflake schema.

Star schema – in a Star schema, a central fact table is surrounded by a single layer of dimension tables. One Star schema will cover one business area in the DW, but you can (and probably will) have multiple Star schemas in your DW. These schemas should be connected by sharing dimension tables (shared dimensions). If the various Star schemas in your DW don’t share dimension tables, you lose the connection between fact tables, making it difficult, if not impossible, to compare data between different fact tables.

Star schema example

Snowflake schema – a Snowflake schema will start out as a Star schema, but then some of the dimensions are normalized, creating the snowflake-like pattern. An advantage of the Snowflake schema is that is can make hierarchies more apparent. For example, a Product dimension might be normalized to join to a ProductSubCategory dimension, and that might in turn be joined to a ProductCategory dimension. In general, Snowflake schemas are discouraged for performance (more joins) and maintainability reasons.

The Snowflake schema starts to add complexity

Summary

Today we covered the two primary types of table found in the data warehouse, the fact table that holds your quantitative data and the dimension table that holds your qualitative, contextual data. We also reviewed the two most common schema designs, the Star schema and the Snowflake schema. Next time we’ll talk more about dimension tables, their columns, and how to handle updates.

Book Review: Big Red - Voyage of a Trident Submarine

by Andy Warren

SQLServerCentral.com

Blogs

I've grown up reading Tom Clancy and probably most of you have at least seen Red October, so this book caught my eye when browsing used books for a recent trip. It's a fairly human look at what's involved in sailing on a Trident missile submarine...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-03-10

1,439 reads

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

by Robert Davis

SQLServerCentral.com

Blogs

Question: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? This question was sent to me via email. My reply follows. Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup? Databases to be mirrored are currently running on 2005 SQL instances but will be upgraded to 2008 SQL in the near future.

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-23

1,567 reads

Inserting Markup into a String with SQL

by Phil Factor

SQLServerCentral.com

T-SQL

In which Phil illustrates an old trick using STUFF to intert a number of substrings from a table into a string, and explains why the technique might speed up your code...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-18

1,631 reads

Networking - Part 4

by Andy Warren

SQLServerCentral.com

Blogs

You may want to read Part 1 , Part 2 , and Part 3 before continuing. This time around I'd like to talk about social networking. We'll start with social networking. Facebook, MySpace, and Twitter are all good examples of using technology to let...

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-17

1,530 reads

Speaking at Community Events - More Thoughts

by Andy Warren

SQLServerCentral.com

Blogs

Last week I posted Speaking at Community Events - Time to Raise the Bar?, a first cut at talking about to what degree we should require experience for speakers at events like SQLSaturday as well as when it might be appropriate to add additional focus/limitations on the presentations that are accepted. I've got a few more thoughts on the topic this week, and I look forward to your comments.

★ ★ ★ ★ ★ ★ ★ ★ ★ ★

You rated this post out of 5. Change rating

2009-02-13

360 reads

Exam Prep 70-463: Tables and Schemas

Tables

Schemas

Summary

Rate

Share

Share

Rate

Exam Prep 70-463: Tables and Schemas

Tables

Schemas

Summary

Rate

Share

Share

Rate

Related content

Book Review: Big Red - Voyage of a Trident Submarine

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Inserting Markup into a String with SQL

Networking - Part 4

Speaking at Community Events - More Thoughts