Trust, or Verify – How FKs Can (or Cannot) Help Performance

Question

Trust, or Verify – How FKs Can (or Cannot) Help Performance

Hugo Kornelis

SSC Guru

Points: 64780
More actions
March 7, 2016 at 11:57 pm

#311966

Comments posted to this topic are about the item Trust, or Verify – How FKs Can (or Cannot) Help Performance
Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

Gabriel P SSCarpal Tunnel Points: 4289 More actions · Answer 1

If you know referential integrity is being enforced, why query the Departments table at all if you're not returning any data from it? If you are not returning any columns from the departments table the only reason it's included in the JOIN is to verify that the foreign key exists in the Departments table...but the foreign key constraint already enforces that. So why even include the Departments table in the query at all?

Maybe I missed a detail or haven't come across a situation where this might be necessary, but it seems if you have the power to comment out all fields from the Department table in the SELECT clause, you have the power to comment out the Department table in the FROM clause. :hehe:

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 2

Gabriel P (3/8/2016)
If you know referential integrity is being enforced, why query the Departments table at all if you're not returning any data from it? If you are not returning any columns from the departments table the only reason it's included in the JOIN is to verify that the foreign key exists in the Departments table...but the foreign key constraint already enforces that. So why even include the Departments table in the query at all?
Maybe I missed a detail or haven't come across a situation where this might be necessary, but it seems if you have the power to comment out all fields from the Department table in the SELECT clause, you have the power to comment out the Department table in the FROM clause. :hehe:

... but not if the FROM clause is embedded in a view. Hugo did acknowledge that the first query wasn't particularly real-world, which is why he gave the example of the view. If you select from a view, you don't necessarily know which tables the columns are coming from.

John

CGSJohnson SSCertifiable Points: 7208 More actions · Answer 3

Nice article, Hugo! Very clearly and well written!

Thank...Chris

Gabriel P SSCarpal Tunnel Points: 4289 More actions · Answer 4

Got it i had a feeling I was missing something. Need my morning coffee :w00t:

BarneyL Ten Centuries Points: 1044 More actions · Answer 5

John Mitchell-245523 (3/8/2016)
Gabriel P (3/8/2016)
If you know referential integrity is being enforced, why query the Departments table at all if you're not returning any data from it? If you are not returning any columns from the departments table the only reason it's included in the JOIN is to verify that the foreign key exists in the Departments table...but the foreign key constraint already enforces that. So why even include the Departments table in the query at all?
Maybe I missed a detail or haven't come across a situation where this might be necessary, but it seems if you have the power to comment out all fields from the Department table in the SELECT clause, you have the power to comment out the Department table in the FROM clause. :hehe:
... but not if the FROM clause is embedded in a view. Hugo did acknowledge that the first query wasn't particularly real-world, which is why he gave the example of the view. If you select from a view, you don't necessarily know which tables the columns are coming from.
John

It comes up a lot for me in data warehousing. Analysts frequently request views which join all the relevant dimensions to a fact table and then use this as a starting point to save them having to write the joins each time. They won't use all the columns for a single query but they could use any of them eventually. Performance for these kinds of views aren't great unless you have foreign keys set up so the unused dimensions are ignored.

jghohengarten Old Hand Points: 306 More actions · Answer 6

Perhaps you could add to your article a query that finds untrusted FK's, also maybe even to dynamically generate the WITH CHECK statements?

Rich Mechaber SSChampion Points: 11130 More actions · Answer 7

jghohengarten (3/8/2016)
Perhaps you could add to your article a query that finds untrusted FK's, also maybe even to dynamically generate the WITH CHECK statements?

--Which FK constraints aren't trusted? (e.g., they were created with NO CHECK)

--============================================================================================

SELECT object_name(parent_object_id) As TableName, *

FROM sys.foreign_keys

WHERE IS_NOT_TRUSTED = 1

ORDER BY TableName, [name];

And this will force a check on every constraint on every table in the database:

EXEC sp_msforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL';

Rich

Rich Mechaber SSChampion Points: 11130 More actions · Answer 8

Nice article Hugo, and I like your inclusion of execution plans to show how untrusted foreign keys can affect query plans.

A nice, small thing you did that will probably help out people new to this concept (and which I mention here because it was confusing to me when I found out about untrusted keys): you put the doubled "CHECK" on 2 separate lines in your code to clarify reading and, I bet, to emphasize that it was not a typo:

-- Re-enable the foreign key

ALTER TABLE dbo.Projects

WITH CHECK

CHECK CONSTRAINT FK_Projects_Departments;

When asked, I've explained it this way: it's 2 separate phrases: "WITH CHECK" to force a check of existing data against the constraint, followed by "CHECK CONSTRAINT" to tell SQL we're altering a check constraint.

Rich

AZJim Default port Points: 1432 More actions · Answer 9

Hugo ... nice article. Thanks.

I guess what I am about to say is akin to stating a religious dogma (or opposing a religious dogma). Nothing seems as contentious among DBAs and developers as the topic of declarative RI.

One area that gets overlooked with RI is locking on the logical parent for deletes and inserts. If your database design employs declarative RI, please take this into consideration. For example, if you are attempting to insert a large number of logical child rows (i.e., with the same logical parent FK value), you will be continuously locking that parent table's row or table's data page. If that logical parent row is subject to updates, you will have contention, maybe even significant contention.

You can often find this type of problem in database designs where a transaction that updates as it completes each of its processing "states" (very typical in the credit card industry). There are design alternatives to this problem, but if the database is already designed, the best you can do is eliminate any ad hoc SQL writes (INSERT, UPDATE, DELETE) and only write to the database using an application. If your shop won't allow you to lock down the database from ad hoc writes, then you have no choice but to employ declarative RI.

Just to show I am not too dogmatic, declarative RI has definite performance benefits for cascade deletes. If that is what you are intending to do, by all means exploit the performance gain you can get from this.

Alan Burstein SSC Guru Points: 61136 More actions · Answer 10

Really great work Hugo.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Rob Schripsema SSCertifiable Points: 7469 More actions · Answer 11

Outstanding article, Hugo. Very clear examples, both the overly simple one to illustrate the problem and the more complex one to show a real world scenario.

Hartelijk bedankt!

Rob Schripsema
Propack, Inc.

Hugo Kornelis SSC Guru Points: 64780 More actions · Answer 12

Thanks for all the feedback and compliments, people!

And sorry for being off the radar just when this article was published.

@gabriel: I see that John already replied to your question. Thanks, John!

@jghohengarten: Good suggestion, that unfortunately didn't cross my mind when writing. Not sure if I would have if I had thought about it, because the target was to write a short (well, by my standards) article. I am happy to see that Rich replied with a script to find untrusted constraints. He addresses the sys.foreign_keys view, if you change it to look at the sys.check_constraints view you will find untrusted CHECK constraints as well.

Be careful with the code Rich posted to force a check on every constraint in the database. It will do exactly what it says on the cover. If you have a multi-GB table with lots of constraints and a few of them are untrusted, then running Rich's script will force a full check of ALL constraints, not just the untrusted ones. Probably a huge overkill that will consume huge amounts of resources.

@rich: I like your suggestion to put a line break between CHECK and CHECK. Let's try if I can retrain my muscle memory to do it like that going forward.

@AZJim: You are right that locks are taken when checking referential integrity, and this can cause issues. However, I rather deal with those issues than with the issues caused by constraint violations. The constraint check is done using a pure read, so the lock will be an S (shared) lock, and released immediately after checking (even when you have set the transaction isolation level to be higher than the default), so it should not pose a big issue in most databases.

I actually dislike cascading deletes and cascading updates. They can wreack havoc in a database if improperly used. But even more important, they do not allow control over the order in which exclusive locks are taken, which means taht it can become very hard to avoid deadlocks.

To everyone else: Thanks much for the kind words!

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Sergiy SSC Guru Points: 110206 More actions · Answer 13

Is the described behaviour version dependent?

I replicated the tables, enforced and checked the FK, and I still have Clustered Index Scan on Projects, Clustered Index Seek on Departments and Nested Loop (inner join).

Same as when I drop the FK completely.

The only way to get rid of departments is to use LEFT JOIN.

I'm testing it on

Microsoft SQL Server 2008 R2 (SP2) - 10.50.4000.0 (X64)

Jun 28 2012 08:36:30

Copyright (c) Microsoft Corporation

Express Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

_____________
Code for TallyGenerator

Hugo Kornelis SSC Guru Points: 64780 More actions · Answer 14

Sergiy (3/9/2016)
Is the described behaviour version dependent?
I replicated the tables, enforced and checked the FK, and I still have Clustered Index Scan on Projects, Clustered Index Seek on Departments and Nested Loop (inner join).
Same as when I drop the FK completely.
The only way to get rid of departments is to use LEFT JOIN.
I'm testing it on
Microsoft SQL Server 2008 R2 (SP2) - 10.50.4000.0 (X64)
Jun 28 2012 08:36:30
Copyright (c) Microsoft Corporation
Express Edition (64-bit) on Windows NT 6.1 <X64> (Build 7601: Service Pack 1)

That's very weird. I have tested on SQL Server 2012 and on SQL Server 2008 (not R2) and got consistent results on both. I do not know if this specific optimization has always been in the product or if it hasn't when it was introduced, but I do know from my tests that it existed in SQL Server 2008.

I cannot explain why you didn't get the expected results on 2008R2. If I can find a 2008R2 instance where I can test this, I will check what happens.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/