A Check and Foreign Key Constraint Improves Query Performance

  • Contrary to popular believe on this thread FKs CAN affect performance when the referencing table is very large and you perform a large number of writes!

    Second, Replication normally marks foreign keys as NFR and that automatically makes them untrusted. If you need multiple streams in the replication flow you will have no choice but to make them untrusted.

    third BCP does not RESPECT that unless you force it to and if you have necessary permisions to run bcp it will make these untrusted.

    There is more than meets the eye when it come to FK. They are deinitely an integrity component but they do degrade write performance.


    * Noel

  • The SQLCat team has published a top 10 list of best practices for building-a-large-scale-relational-data-warehouse in which they recommend against checking foreign keys on large fact tables. (recommendation 4, last point)

    They don't really provide any explanation for that recommendation though, and I'm wondering if anyone can help flesh that out in light of this article, which pretty much matched my understanding.

    Thanks!



    Dan Guzman - Not the MVP (7/22/2010)
    All questions have to be prefaced by Server version and 'according to MS Docs' or 'my own personal opinion based on how much detail I felt like digging into at the time.'

  • This is interesting and it certainly makes sense. But this has only shown improvements in select statements. How about insert/update statements. I would imagine that foreign keys will actually reduce performance in that case.

    What I am trying to say is that the conlusion made by this article is valid only in some situations.

  • This was a very good post 🙂

  • pslavik (10/21/2010)


    This is interesting and it certainly makes sense. But this has only shown improvements in select statements. How about insert/update statements. I would imagine that foreign keys will actually reduce performance in that case.

    What I am trying to say is that the conlusion made by this article is valid only in some situations.

    Most databases I know of are read heavy so that is not of much concern. Besides even enabled, non-trusted foreign key constraints have enforced referential integrity and thus carry a performance penalty without having the performance benefits of a trusted foreign key constraint to offset this cost.

    And it is very hard to argue in favor of a database without any referential integrety at all. Thus if you do have foreign key constraints to enforce correctness, you better keep them trusted for best performance.

  • Excellent article. Clearly articulated and great examples. Thanks.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • weitzera (10/21/2010)


    The SQLCat team has published a top 10 list of best practices for building-a-large-scale-relational-data-warehouse in which they recommend against checking foreign keys on large fact tables. (recommendation 4, last point)

    They don't really provide any explanation for that recommendation though, and I'm wondering if anyone can help flesh that out in light of this article, which pretty much matched my understanding.

    Thanks!

    The fact and dimension tables in a data warehouse are going to be generated from another data source, which presumable IS guaranteeing the data integrity of the data. So assuming your data sources and transformations are correct, these FKs are areguably, unnecessary, and one of the main points of the dimension approach is denormalization for performance.

  • Excellent Article.Learnt a lot in 5 minutes of reading this.

  • The points being made about the effect of FK of performance depending on the operation are valid. And the exists clause that article's first example shows as faster with the FKs is frankly, idiotic in light of the FK (it's a little like adding a clause to your query that stipulates that an int field is in fact an int!). What that example is really demonstrating is that it is more efficient to use FKs than to add integrity checks into your queries that take the place of FKs, which shouldn't be very surprising. Does it really need to be argued and demonstrated that it is better to ensure data integrity when you store the data than to do integrity checks every time you use the data?

    However, unless you are doing something very specialized, it also would be pretty idiotic to not be using FKs due to performance worries. That's rather like leaving error checking out of your code for performance worries (oh gee, this != null check might cost me a few nanoseconds!). Or writing all your code as one big method because you are worried about the performance of method calls and parameter passing.

    And if you really are in some specialized situation where you are trying to wring every last iota of performance out of an operation, then worrying about FK constraints is still premature unless you have done testing and seen how they impact your specific situation.

  • Douglas Osborne-456728 (10/21/2010)


    All,

    I came up with this little trick - run the query

    SELECT Name AS [Constraint], object_name(parent_object_id) AS [Table]

    FROM sys.foreign_keys

    WHERE Is_Not_Trusted = 1

    then run this

    SELECT 'ALTER TABLE ' + OBJECT_NAME( Parent_Object_ID ) + ' WITH CHECK CHECK CONSTRAINT ' + Name AS [SQL to Execute]

    FROM sys.foreign_keys

    WHERE Is_Not_Trusted = 1

    Cut and paste the results back into query analyzer and execute - note any errors - huge fix.

    Great article!

    Here's a slightly different version I came up with. Uncomment the last command to actually change the constraints (first part only evaluates):

    --FK Violations

    --If any errors are returned, investigate before changing constraints

    --See: http://stackoverflow.com/questions/1098554/sql-server-how-to-make-server-check-all-its-check-constraints

    USE mydatabase;

    --Which FK constraints aren't trusted? (i.e., they were created with NO CHECK)

    --============================================================================================

    SELECT object_name(parent_object_id) As TableName, *

    FROM sys.foreign_keys

    WHERE IS_NOT_TRUSTED = 1

    ORDER BY TableName, [name]

    --Check entire DB for FK constraint violations, whether enabled or not, whether trusted or not (takes a bit of time to run)

    --============================================================================================

    --DBCC CHECKCONSTRAINTS WITH ALL_CONSTRAINTS

    --Fix all FK constraints in DB to be trusted

    --============================================================================================

    /*

    Note that the 2nd "CHECK" is not a typo.

    Think of it as these 2 phrases: "WITH CHECK" to force a check of existing data against the constraint, followed by

    "CHECK CONSTRAINT" to tell SQL we're altering a check constraint (or all of them).

    */

    --EXEC sp_msforeachtable 'ALTER TABLE ? WITH CHECK CHECK CONSTRAINT ALL'

    HTH,

    Rich

  • Rich,

    Why would you just randomly run that against ALL tables? Mine was targeted to the FKs marked as untrusted.

    Wondering,

    Doug

  • Douglas Osborne-456728 (10/22/2010)


    Rich,

    Why would you just randomly run that against ALL tables? Mine was targeted to the FKs marked as untrusted.

    Wondering,

    Doug

    Hmmm, good point. I was thinking that if the constraint was already trusted, that the ALTER command would be skipped. But that's a lazy (and potentially time-consuming) assumption on my part. As I only did this once on a database, it wasn't really an issue for me (I did this when I was down for maintenance).

    This arose for a decidedly non-random reason BTW: a 3rd party app was executing several stored procedures very slowly. Turns out there were hundreds of untrusted FK constraints that were created NOT FOR REPLICATION.

    Unfortunately, it also turns out that you cannot, apparently, make a constraint trusted if it was created NOT FOR REPLICATION (I posted a while ago about this http://www.sqlservercentral.com/Forums/Topic944850-146-1.aspx). You have to DROP and CREATE the constraint.

    While looking up info, I came across this post that suggests BOL may be inaccurate about this: http://connect.microsoft.com/SQLServer/feedback/details/585467/with-check-constraint-can-leave-table-with-untrusted-foriegn-keys.

    Yours,

    Rich

  • Rich,

    Nice add - I didn't know that NOT FOR REPLICATION would do that.

    This has been a really good article and an extensive followup discussion.

    Best,

    Doug

  • Interesting article! Thanks for taking the time to put it together.

    Regards,

    --Mike

  • I agree. You could just write

    select so.* from SalesOrderdetail as so

    and get the same result. (with the not null model)

    That code would only be written if the developer did not know the data structure, or did not TRUST it. At least the cost of that distrust is very low (optimizer time only).

    I think the cost of a FK constraint is in the insert and update side. There is a savings when the developer does not have to worry about detail rows that don't join to the header. That saving exists even without a FK constraint, if the data entry system is trusted to ensure it. But I would tend to constrain the data to ensure that the data entry code continues to work properly.

    I think the conclusion that constraints improve performance is not sufficiently proven here.

    But this article brought up several good points: 1) Making a FK null where the PK is not null will allow null values to be added that are not constrained. That is an interesting point. It is a good practice to keep the pk and FK are the same, even down to the null/not null constraint.

    2) Check for untrusted constraints and correct them, as several of us have already done.

    Thank you for a stimulating and informative article.

Viewing 15 posts - 31 through 45 (of 92 total)

You must be logged in to reply to this topic. Login to reply