Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 1234»»»

Foreign key constraints in data warehouses Expand / Collapse
Author
Message
Posted Thursday, February 7, 2008 12:20 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, November 20, 2012 4:22 PM
Points: 21, Visits: 90
I am just learning about data warehousing and business intelligence, and reading Ralph Kimball’s excellent book “The Data Warehouse Toolkit, Second Edition”.

In dimensional modeling, many of the fact table fields are foreign keys that link to the primary keys of dimension tables. My question is, when the data warehouse database is implemented in SQL Server 2005, is there any point to adding SQL Server foreign key constraints between the fact and the dimension tables? The data should have already been cleansed in the loading process, and adding foreign key constraints doesn’t automatically index the foreign key fields – the fields have to be indexed separately. It seems that foreign key constraints might add complexity, slow processing, and provide no benefit.

Any opinions?
Post #452895
Posted Thursday, February 7, 2008 12:54 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Thursday, July 17, 2014 7:55 AM
Points: 2,805, Visits: 3,067
The cleansing routine is done during the ETL process. After the data loaded into the dimension table, it is not guaranteed that no one would delete it.
To put a foreign key in the fact table is used for data integrity.
Post #452912
Posted Friday, March 7, 2008 2:17 PM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Today @ 11:43 AM
Points: 801, Visits: 1,975
I understand your point about the data cleansing. Why should it be necessary? One, it will help analysis services when you import the tables because it will see the relationships. Two, just in case something is wrong somewhere, the FKs might show it. I recently had a case where having FKs showed an issue that would not have shown up otherwise.


Post #466134
Posted Saturday, March 8, 2008 11:48 PM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Sunday, July 3, 2011 2:20 AM
Points: 112, Visits: 77
I agree that you should always add a FK in your data warehouse.

My other consideration is that you can always use the benefits of index between FK column in your fact table and PK in your dimension table if you want to drill through the data
Post #466370
Posted Sunday, March 9, 2008 10:36 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Tuesday, November 25, 2008 4:37 AM
Points: 4, Visits: 32
During ETL, loading Facts we always do a lookup with the dimensions to ensure that data gets filtered. In other words fact tables will contain only that data whose reference is present in dimension tables. Adding primary key foreign key relationship will decrease the response time
Post #466471
Posted Sunday, March 9, 2008 10:56 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Thursday, June 5, 2014 10:54 AM
Points: 9,902, Visits: 9,480
Chuck Bevitt (2/7/2008)
My question is, when the data warehouse database is implemented in SQL Server 2005, is there any point to adding SQL Server foreign key constraints between the fact and the dimension tables?


I wouldn't do it. Foreign Keys are really antithetical to the purpose and philosophy of OLAP and data warehouses.


-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Post #466476
Posted Monday, March 10, 2008 9:09 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Today @ 11:43 AM
Points: 801, Visits: 1,975
During ETL, loading Facts we always do a lookup with the dimensions to ensure that data gets filtered. In other words fact tables will contain only that data whose reference is present in dimension tables. Adding primary key foreign key relationship will decrease the response time


I suppose that for some downloads there would be no change to the dimension tables, but I haven't had that experience. There are always new dimension facts to download too, such as new technicians or new zip codes where the work is being done. There is the chance that something will go wrong with those downloads, in which case, you wouldn't want the fact table to also succeed. FKs prevent that second download from succeeded.

There could be a case where the download time performance is absolutely crucial, but most downloads occur in the middle of the night, and 10 or 20 extra seconds isn't going to matter.



Post #466744
Posted Monday, March 10, 2008 1:07 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Thursday, June 5, 2014 10:54 AM
Points: 9,902, Visits: 9,480
RonKyle (3/10/2008)
[quote]I suppose that for some downloads there would be no change to the dimension tables, but I haven't had that experience. There are always new dimension facts to download too, such as new technicians or new zip codes where the work is being done. There is the chance that something will go wrong with those downloads, in which case, you wouldn't want the fact table to also succeed. FKs prevent that second download from succeeded.

There could be a case where the download time performance is absolutely crucial, but most downloads occur in the middle of the night, and 10 or 20 extra seconds isn't going to matter.


In general, I think that it is better to implement ETL features in the ETL itself and keep them out of the OLAP. And that is a large part of the whole philosophy of having an OLAP database that is separate from your OLTP database.

Also, I have been in many environments where my customer was pushing the limits of their 4 to 6 hour ETL window and there have not been many cases where one FK would have added only 10 to 20 seconds, let alone one or more for every dimension.


-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Post #466898
Posted Monday, March 10, 2008 1:09 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Thursday, June 5, 2014 10:54 AM
Points: 9,902, Visits: 9,480
Loner (2/7/2008)
The cleansing routine is done during the ETL process. After the data loaded into the dimension table, it is not guaranteed that no one would delete it.
To put a foreign key in the fact table is used for data integrity.


Users and Apps should not be deleting records or making any other permanent modifications to an OLAP database. Only the ETL should be doing that.


-- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoung
Proactive Performance Solutions, Inc.
"Performance is our middle name."
Post #466900
Posted Tuesday, March 11, 2008 2:05 AM
SSC-Enthusiastic

SSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-EnthusiasticSSC-Enthusiastic

Group: General Forum Members
Last Login: Sunday, July 3, 2011 2:20 AM
Points: 112, Visits: 77
Since data warehouse is not a mission critical, I completely agree with Ron. Added a FK would not decrease a process at all, because we do it in a batch. No users involved in this process

Also imagine when you had to drillthrough the data. FK will speed up the process.

Post #467220
« Prev Topic | Next Topic »

Add to briefcase 1234»»»

Permissions Expand / Collapse