Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Would you store foreign key from grandparent tables? Why or why not? Expand / Collapse
Author
Message
Posted Tuesday, November 27, 2012 6:12 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 10:50 AM
Points: 76, Visits: 435
I'm curious about potential ramifications of storing a foreign key that is derivable.

For example, consider that we have traditional 3 tables; Customers, Orders and Order Details with appropriate foreign key constraints in place. We could want to filter all Order Details rows using Customer primary key. Customer key can be derived via the intervening Orders table so there is no need to store the Customer key in the Order Details table.

In most scenarios, we usually would be joining all grandparent, parent and the child tables (e.g. we usually want the customer's name, order's date and then the line items) so this is usually a moot point. However, suppose there is a legitimate need to filter the child table based on grandparent's keys without joining the parent table in. A possible example is when you have a Customer -> Account -> Order and for large part of the application, you just need to correlate Customer to Order and not be concerned about Account details. In theory, storing & indexing Customer's keys in the Order table would save us from joining in the intervening Account table and should be a bit faster.

Has this worked out in practice? If you've done it before, how did it work out? If not, why not?

Sample T-SQL illustrating possible schema for enforcing grandparent keys in the child table is provided.

EDIT: Updated the script to include the ON UPDATE CASCADE option on child's foreign key.
SET QUOTED_IDENTIFIER ON;
SET ANSI_NULLS ON;

GO

CREATE DATABASE demo;

GO

USE demo;

GO

CREATE TABLE dbo.GrandParent(
ID INT CONSTRAINT PK_GrandParent PRIMARY KEY
);

CREATE TABLE dbo.Parent(
ID INT CONSTRAINT PK_Parent PRIMARY KEY,
GrandParentID INT NOT NULL CONSTRAINT FK_Parent_GrandParentID FOREIGN KEY REFERENCES dbo.GrandParent(ID),
CONSTRAINT UQ_Parent_ID_GrandParentID UNIQUE (ID, GrandParentID)
);

CREATE TABLE dbo.Child(
ID INT CONSTRAINT PK_Child PRIMARY KEY,
GrandParentID INT NOT NULL,
ParentID INT NOT NULL,
CONSTRAINT FK_Child_Parent_GrandParent FOREIGN KEY (ParentID, GrandParentID) REFERENCES dbo.Parent(ID, GrandParentID) ON UPDATE CASCADE
);

GO

INSERT INTO dbo.GrandParent(ID) VALUES
(1),
(2),
(3)
;

INSERT INTO dbo.Parent(ID, GrandParentID) VALUES
(1, 1),
(2, 1),
(3, 1),
(4, 2),
(5, 2),
(6, 2),
(7, 3),
(8, 3),
(9, 3)
;

INSERT INTO dbo.Child(ID, GrandParentID, ParentID) VALUES
(1, 1, 1),
(2, 1, 2),
(3, 1, 3),
(4, 2, 4),
(5, 2, 5),
(6, 2, 6),
(7, 3, 7),
(8, 3, 8),
(9, 3, 9),
(10, 1, 1),
(11, 1, 2),
(12, 1, 3),
(13, 2, 4),
(14, 2, 5),
(15, 2, 6),
(16, 3, 7),
(17, 3, 8),
(18, 3, 9),
(19, 1, 1),
(20, 1, 2),
(21, 1, 3),
(22, 2, 4),
(23, 2, 5),
(24, 2, 6),
(25, 3, 7),
(26, 3, 8),
(27, 3, 9),
(28, 1, 1),
(29, 1, 2),
(30, 1, 3),
(31, 2, 4),
(32, 2, 5),
(33, 2, 6),
(34, 3, 7),
(35, 3, 8),
(36, 3, 9)
;

GO

--Should fail; violates contrsaint
INSERT INTO dbo.Child(ID, GrandParentID, ParentID) VALUES
(37, 1, 4);

GO

--Should fail; key is in use
UPDATE dbo.Parent
SET ID = 10
WHERE ID = 2;

GO

--Should succeed, child's grandparentkey will be updated
UPDATE dbo.Parent
SET GrandParentID = 3
WHERE ID = 2;

GO

--Should succeed; changing the parent but not grandparent
UPDATE dbo.Child
SET ParentID = 2
WHERE ID = 1;

GO

--Should fail; not valid ParentID for the given Grandparent
UPDATE dbo.Child
SET ParentID = 4
WHERE ID = 1;

GO

--Should succeed, changing both parent and grandparent
UPDATE dbo.Child
SET ParentID = 4,
GrandParentID = 2
WHERE ID = 1;

GO

--Should fail; can't change GrandParentID without changing ParentID
UPDATE dbo.Child
SET GrandParentID = 3
WHERE ID = 1;

GO

USE master;

GO

DROP DATABASE demo;

GO

Post #1389142
Posted Tuesday, November 27, 2012 8:27 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 7:36 AM
Points: 13,139, Visits: 11,979
I would highly recommend NOT doing that. It will cause you untold pain in the future. You are proposing to violate normal form which will cause you far more pain than it will save. There is a reason you identify these tables as grandparent -> parent -> child. You will end up with a total mess on your hands. You are denormalizing your database to make it easier for queries. This comes at the cost of data integrity. Just my 2¢.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1389227
Posted Tuesday, November 27, 2012 8:35 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 10:50 AM
Points: 76, Visits: 435
I'd have agreed if the updates required additional work (e.g. manually updating grandparent key in both parent and child table) but the schema I demonstrated does that via ON UPDATE CASCADE and UNIQUE constraint on the parent table and therefore requires no additional coding. Since this is pure SQL with no triggers or other manual coding, I'm less than sure that this is a clear-cut case of "bad" denormalizing, which is why I asked.

Can you demonstrate where data integrity will be broken in this schema?
Post #1389238
Posted Tuesday, November 27, 2012 10:21 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 7:36 AM
Points: 13,139, Visits: 11,979
Well your constraints do prevent bad data. It certainly makes updates more of a pain to deal with. I guess the issue really is denormalization. There are certainly times when this is appropriate. I can't really hazard a guess if your situation is one of those or not because the sample tables are so simple. My guess is that because you are asking the question it probably is not the best way to proceed. Typically if something doesn't seem right, it probably isn't.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Moden's splitter.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs
Understanding and Using APPLY (Part 1)
Understanding and Using APPLY (Part 2)
Post #1389304
Posted Tuesday, November 27, 2012 11:59 AM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, July 20, 2014 7:16 AM
Points: 945, Visits: 1,769
ill add a little of my view to sean's. in an OLTP (Transaction) system i would say it would be very limited where i would de-normalize, i would not recommend it as a first choice tool. there are times (and in my current job before i got here they de-normalized the structure) where it can make sense but that determination is probably not going to be made through a forum thread.


For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw

Need to Split some strings? Jeff Moden's DelimitedSplit8K
Jeff Moden's Cross tab and Pivots Part 1
Jeff Moden's Cross tab and Pivots Part 2

Jeremy Oursler
Post #1389359
Posted Tuesday, November 27, 2012 3:45 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 10:50 AM
Points: 76, Visits: 435
Sean:

I certainly do agree that we're doing more work on the inserting new rows to Child tables and that also allows for more errors (e.g. trying to insert wrong grandparent ID due to a bug in the application).

However, I hope you forgive me for saying that but I was hoping for more than "nobody has done it so it must be wrong". After all, it didn't stop Codd when he first formalized what we now know as normalization. ;)

Normalization are quite clear-cut because we can see the negative consequence of update anomalies created when normalization is violated. The only reason to denormalize is really due to physical limitations and even so it's mightily an iffy proposition in OLTP realm. For the above schema, the only valid reason (to me at least) is to be able to create an index on the Child that correlates to the GrandParent records and having enough of cardinality to make a difference... in theory.

For example, we could compare those queries:

Traditional approach:
SELECT c.Customer, o.*
FROM dbo.Customers AS c
INNER JOIN dbo.Accounts AS a
ON c.ID = a.CustomerID
INNER JOIN dbo.Orders AS o
ON a.ID = o.AccountID
WHERE c.Inactive = 0
AND c.CustomerType = 1;

Alternative approach:
SELECT c.Customer, o.*
FROM dbo.Customers AS c
INNER JOIN dbo.Orders AS o
ON c.ID = o.CustomerID
WHERE c.Inactive = 0
AND c.CustomerType = 1;

Under traditional approach, we'd need to use indices on Order's AccountID and on Account's CustomerID to effectively process the results whereas the alternative approach would use only single index. That would certainly mean less pages to access in returning the results, right? Or am I missing factors/other details that may make this unnecessary or overstating the benefits?



capnhector:

I absolutely agree that denormalization shouldn't be the first choice. My intention was primarily to find out if someone else thought of that already and see what experience was like. Sean's suggestion that having to asking suggests that it may be wrong is typically correct but I didn't just want to simply assume. :)
Post #1389441
Posted Tuesday, November 27, 2012 4:11 PM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, July 20, 2014 7:16 AM
Points: 945, Visits: 1,769
Traditional approach:
SELECT c.Customer, o.*
FROM dbo.Customers AS c
INNER JOIN dbo.Accounts AS a
ON c.ID = a.CustomerID
INNER JOIN dbo.Orders AS o
ON a.ID = o.AccountID
WHERE c.Inactive = 0
AND c.CustomerType = 1;

Alternative approach:
SELECT c.Customer, o.*
FROM dbo.Customers AS c
INNER JOIN dbo.Orders AS o
ON c.ID = o.CustomerID
WHERE c.Inactive = 0
AND c.CustomerType = 1

;


We do something similar to the above. if you have alot of queries that make that join it does justify testing the improvement of all of the queries. in our case we found a large improvement in overall execution time of the queries we were able to simplify (executed several million times a day 2-3 ms makes a huge difference over a day). so if you can justify it with performance testing and showing an improvement i dont think there would be an issue.

The question then becomes why were AccountID and CustomerID seperate to begin with? Do you need both in the orders table?



For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw

Need to Split some strings? Jeff Moden's DelimitedSplit8K
Jeff Moden's Cross tab and Pivots Part 1
Jeff Moden's Cross tab and Pivots Part 2

Jeremy Oursler
Post #1389456
Posted Tuesday, November 27, 2012 5:44 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Tuesday, July 29, 2014 10:50 AM
Points: 76, Visits: 435
capnhector -

In this hypothetical example, it was assumed that a customer could have several accounts and orders are related to accounts for budgeting purposes. In this situation, we shouldn't store CustomerID in the Orders table because it's already implied by AccountID.

You've indicated that in some situations, being able to skip a join can make a big difference provided that we can back the specific scenario with a performance test. Did you include the unique constraint & update cascade?
Post #1389474
Posted Tuesday, November 27, 2012 6:07 PM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Sunday, July 20, 2014 7:16 AM
Points: 945, Visits: 1,769
we did not include the unique constraint and do not foreign key to the accounts table on CustomerID, AccountID from the orders table. (all tables changed to reflect the previously posted queries). we did however leave the AccountID and just added the CustomerID column to the orders table with a foreign key back to the customers table. and while this could lead to an update error our business rules are written that the update is never made.


For faster help in answering any problems Please read How to post data/code on a forum to get the best help - Jeff Moden for the best way to ask your question.

For performance Issues see how we like them posted here: How to Post Performance Problems - Gail Shaw

Need to Split some strings? Jeff Moden's DelimitedSplit8K
Jeff Moden's Cross tab and Pivots Part 1
Jeff Moden's Cross tab and Pivots Part 2

Jeremy Oursler
Post #1389478
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse