A Check and Foreign Key Constraint Improves Query Performance

  • ps. (10/21/2010)


    any idea how do we check the same in sql server 2000? sysforeignkeys doesn't have is_trusted column..

    I'm pretty sure that SQL 2000 doesn't have that same capability. I just disabled a foreign key constraint on one of our SQL 2000 DB, confirmed that it scanned both tables then re-enabled it and without any other action it just scanned the table that I was selecting from.

  • Great article. Didn't understand example of 'TeamID' in the last part. Did you mean 'CustomerID'?

  • cfradenburg (10/21/2010)


    garima.arya (10/21/2010)


    If someone has added a foreign key constraint, why would he write a query like this.

    Interesting point about how the query optimizer works but I'm more with Garima. If there's a foreign key constraint there what's the advantage of putting the exists check in the code? The only thing I can think of is that it will make it a little easier to follow if you're not familiar with the DB but I wouldn't think that would be worth giving the optimizer the possibility of generating an execution plan that will scan both tables.

    I think that trusted foreign key constraints provide information that can be used to cut out more of an execution plan then just the simple exists used in this article. It is fact that multiple trusted references to the same target table signify that the data in those foreign key fields is directly comparable to eachother. So the query optimizer can figure this out and simplify execution plans.

    As for the arguments that this only involves queries written without build in knowledge by the writer, consider views that server multiple purposes. For views it is accepted that the optimizer has to cut out unused parts based on relevance to the consuming query. The writer of the consuming query might have the knowledge, but not the means to do this for the optimizer in this instance, nor does the writer of the view.

  • Here is a query I wrote for SQL Server 2008 and its problably good for 2005 as well. For every untrusted foreign key it generates a small script of either two or four commands, here is what they do:

    1. Find records that are in conflict with the foreign key constraint definition.

    2. Enable and re-check the foreign key constraint (only works when 'not for replication' was not used).

    Optional, when 'not for replication' was used during constraint creation.

    3. A drop statement for the foreign key.

    4. A create statement for the foreign key, this time without 'not for replication'.

    set nocount on

    ;

    with

    fkCompleteQ as

    (

    select

    is_not_for_replication = fk.is_not_for_replication

    , parent_schema_id = fk.schema_id

    , parent_object_id = fk.parent_object_id

    , referenced_object_id = fk.referenced_object_id

    , referenced_schema_id = rt.schema_id

    , object_id = fk.object_id

    , delete_referential_action_desc = replace( fk.delete_referential_action_desc, '_', ' ' )

    , update_referential_action_desc = replace( fk.update_referential_action_desc, '_', ' ' )

    , parent_fields = stuff( ( select ', [' + cp.name + ']' from sys.foreign_key_columns as fkc inner join sys.columns as cp on cp.object_id = fkc.parent_object_id and cp.column_id = fkc.parent_column_id where fkc.constraint_object_id = fk.object_id order by fkc.constraint_column_id for xml path( '' ) ), 1, 2, '' )

    , referenced_fields = stuff( ( select ', [' + cr.name + ']' from sys.foreign_key_columns as fkc inner join sys.columns as cr on cr.object_id = fkc.referenced_object_id and cr.column_id = fkc.referenced_column_id where fkc.constraint_object_id = fk.object_id order by fkc.constraint_column_id for xml path( '' ) ), 1, 2, '' )

    , join_fields = stuff( ( select ' and ( p.[' + cp.name + '] is null or p.[' + cp.name + ']=r.[' + cr.name + '] )' from sys.foreign_key_columns as fkc inner join sys.columns as cp on cp.object_id = fkc.parent_object_id and cp.column_id = fkc.parent_column_id

    inner join sys.columns as cr on cr.object_id = fkc.referenced_object_id and cr.column_id = fkc.referenced_column_id where fkc.constraint_object_id = fk.object_id order by fkc.constraint_column_id for xml path( '' ) ), 1, 5, '' )

    from

    sys.foreign_keys as fk

    inner join sys.tables as rt on rt.object_id = fk.referenced_object_id

    where

    fk.is_not_trusted = 1 /* and fk.is_disabled = 0 */

    )

    select

    result = '/* constraint: [' + schema_name( fk.parent_schema_id )+ '].[' + object_name( fk.parent_object_id ) + '].[' + object_name( fk.object_id ) + '] */' + char(10)

    + 'select p.* from [' + schema_name( fk.parent_schema_id )+ '].[' + object_name( fk.parent_object_id ) + '] as p where not exists ( select 1 from [' + schema_name( fk.referenced_schema_id )+ '].[' + object_name( fk.referenced_object_id ) + '] as r where ' + join_fields + ');' + char(10)

    + 'alter table [' + schema_name( fk.parent_schema_id )+ '].[' + object_name( fk.parent_object_id ) + '] with check check constraint [' + object_name( fk.object_id ) + '];' + char(10)

    + case

    when is_not_for_replication = 0

    then ''

    else

    '-- alter table [' + schema_name( fk.parent_schema_id )+ '].[' + object_name( fk.parent_object_id ) + '] drop constraint [' + object_name( fk.object_id ) + '];' + char(10)

    + '-- alter table [' + schema_name( fk.parent_schema_id )+ '].[' + object_name( fk.parent_object_id ) + '] add constraint [' + object_name( fk.object_id ) + '] foreign key ( ' + parent_fields + ' ) references [' + schema_name( fk.referenced_schema_id )+ '].[' + object_name( fk.referenced_object_id ) + '] ( ' + referenced_fields + ' ) on DELETE ' + fk.delete_referential_action_desc + ' on UPDATE ' + fk.update_referential_action_desc + ';' + char(10)

    end

    from

    fkCompleteQ as fk

    order by

    fk.parent_schema_id

    , fk.parent_object_id

    , fk.referenced_schema_id

    , fk.referenced_object_id

    , fk.object_id

    , result

    ;

    Execute using text mode output (Ctrl-t) to get the script and copy/paste it to a new query window to examine and possibly execute parts of it. If you find parts of the resuls truncated, go to:

    Tools -> Options -> Query Results -> Results to text -> Maximum number of characters displayed in each column. And set the value to a larger number (2048) should be large enough for this.

    NOTE:

    I modified the code a bit to take into acount optional foreign key fields.

  • I liked the article as it brought into focus an element of the QO that I had not considered before.

    As an aside, I would like to point out that we commonly optimize exists / not exists sub-queries by using left outer joins. I've gotten into the habit now of not using 'exists' sub-queries due to the order of magnitude difference I've seen on some query executions.

    In any case, it is a good article.

  • This behavior of sql server of not checking fk constraints when re-enabling checking shows a woeful lack of concern for database integrity. It boggles the mind. Thanks for pointing it out. Of course the only legitimate use of turning of a fk constraint I can think of is for a case where you might need to drop and recreate the other table for a schema change. The lesson I get from this is stay away from NOCHECK. If I'm not mistaken, this isn't standard SQL anyway.

    BTW it looks like there might be the remnant of some previous version in your article where you used a different schema in the example: at one point you mention "TeamID".

  • peter-757102 (10/21/2010)


    cfradenburg (10/21/2010)


    garima.arya (10/21/2010)


    If someone has added a foreign key constraint, why would he write a query like this.

    Interesting point about how the query optimizer works but I'm more with Garima. If there's a foreign key constraint there what's the advantage of putting the exists check in the code? The only thing I can think of is that it will make it a little easier to follow if you're not familiar with the DB but I wouldn't think that would be worth giving the optimizer the possibility of generating an execution plan that will scan both tables.

    I think that trusted foreign key constraints provide information that can be used to cut out more of an execution plan then just the simple exists used in this article. It is fact that multiple trusted references to the same target table signify that the data in those foreign key fields is directly comparable to eachother. So the query optimizer can figure this out and simplify execution plans.

    As for the arguments that this only involves queries written without build in knowledge by the writer, consider views that server multiple purposes. For views it is accepted that the optimizer has to cut out unused parts based on relevance to the consuming query. The writer of the consuming query might have the knowledge, but not the means to do this for the optimizer in this instance, nor does the writer of the view.

    Lets clarify this with some actual test code (read the comments in the script!!!):

    set nocount on;

    -- cleanup code

    --

    -- drop view dbo.TestViewReferences

    -- drop table dbo.TestReferences

    -- drop table dbo.TestReferenced

    create table dbo.TestReferenced

    (

    Id int not null

    , Data int not null

    , constraint pk_TestReferenced primary key clustered ( Id )

    );

    go

    insert into dbo.TestReferenced( Id, Data ) values ( 1, 10 );

    insert into dbo.TestReferenced( Id, Data ) values ( 2, 20 );

    go

    create table dbo.TestReferences

    (

    Id int not null

    , IdTestReferenced int not null

    , constraint pk_TestReferences primary key clustered ( Id )

    , constraint fk_TestReferences_TestReferenced foreign key ( IdTestReferenced ) references dbo.TestReferenced( Id )

    );

    go

    insert into dbo.TestReferences( Id, IdTestReferenced ) values ( 1, 1 );

    insert into dbo.TestReferences( Id, IdTestReferenced ) values ( 2, 2 );

    go

    create view dbo.TestViewReferences as

    select

    ttr.Id

    , ttr.IdTestReferenced

    , tr.data

    from

    dbo.TestReferences as ttr

    inner join TestReferenced as tr on tr.Id = ttr.IdTestReferenced

    ;

    go

    -- ----------------------------------------------------------------------------------------

    -- Testing against the view starts here and proves the importance of trusted foreign keys

    -- ----------------------------------------------------------------------------------------

    -- On the trusted foreign key, the first query only needs to access table "dbo.TestReferences"

    select Id, IdTestReferenced from dbo.TestViewReferences;

    -- Only when we require data that can only be found in the table "dbo.TestReferenced", will it be accessed.

    select Id, data from dbo.TestViewReferences;

    go

    -- disbable constraint

    alter table dbo.TestReferences nocheck constraint fk_TestReferences_TestReferenced;

    -- re-enable constraint without 'with check', so it becomes "non-trusted"

    alter table dbo.TestReferences check constraint fk_TestReferences_TestReferenced;

    -- for the statement that previously had an optiomal plan, the optimizer can no longer trust a "dbo.TestReferenced"

    -- record exists and will have to make a inner join to make sure. This join is required for a result due to the

    -- inner join with this table.

    select Id, IdTestReferenced from dbo.TestViewReferences;

    go

    -- re-check the already enabled constraint so it becomes trusted again

    alter table dbo.TestReferences with check check constraint fk_TestReferences_TestReferenced;

    -- now this query will access only table "TestReferences" again.

    select Id, IdTestReferenced from dbo.TestViewReferences;

    go

    -- cleanup code

    --

    -- drop view dbo.TestViewReferences

    -- drop table dbo.TestReferences

    -- drop table dbo.TestReferenced

  • Oh WOW...I just discovered that 99.9% of our FK's are NOT TRUSTED! Out of 1,297 FK's, only ONE is trusted.

    We use an enterprise software package that is about 10 years old. I'm curious if these developers know something I don't or if they were not aware of this setting. This leads me to a question...

    Should I consider going through each FK, using the WITH CHECK command and setting these back to "trusted"? It seems to be an obvious "yes" but perhaps there are some "gotcha's" to doing this? I wasn't aware of this setting before today so I don't want to charge in and change all these to "trusted" without first finding out if there is a downside to that setting.

    Anyone have some advice?

  • All,

    I came up with this little trick - run the query

    SELECT Name AS [Constraint], object_name(parent_object_id) AS [Table]

    FROM sys.foreign_keys

    WHERE Is_Not_Trusted = 1

    then run this

    SELECT 'ALTER TABLE ' + OBJECT_NAME( Parent_Object_ID ) + ' WITH CHECK CHECK CONSTRAINT ' + Name AS [SQL to Execute]

    FROM sys.foreign_keys

    WHERE Is_Not_Trusted = 1

    Cut and paste the results back into query analyzer and execute - note any errors - huge fix.

    Great article!

  • Great article! Just ran the query against sys.foreign_keys and found that 30% of our foreign_keys are not trusted. =/

    Thanks for the great tip!

  • Great article !

    I only disagree with the conclusion

    Conclusion

    In this article I have showed that check and foreign constraints do not degrade performance but actually improves performance.

    because you've showed that check an foreign constraints do not degrade performance in these querys but not in all querys you may perform. Besides, foreign key constraints actually degrade performance when you delete rows in the "master" table (because it has tu check if any row is used in another table).

    Thanks, Marcos.

    Author of Nautilus - http://sourceforge.net/projects/nautilus/

  • marcosc (10/21/2010)


    Great article !

    I only disagree with the conclusion

    Conclusion

    In this article I have showed that check and foreign constraints do not degrade performance but actually improves performance.

    because you've showed that check an foreign constraints do not degrade performance in these querys but not in all querys you may perform. Besides, foreign key constraints actually degrade performance when you delete rows in the "master" table (because it has tu check if any row is used in another table).

    Thanks, Marcos.

    Author of Nautilus - http://sourceforge.net/projects/nautilus/%5B/quote%5DWhile you are correct on an individual statement level, the author is correct in stating that trusted foreign keys improve performance. Bar few exceptions, a database is more heavily read then written to.

    Your are correct in that he did not prove it tho 😉

    Please do take a look at the previous post I made with SQL code that demonstrates its effect on views that are used partially.

  • Very kewl one... worth mentioning it. thanx. could please post more of these.

  • Peter, I totally agree with you. Databases are much more read than updated and even less delete operation are performed. It was just a small detail in a great article.

    I've discovered great performance tipos watching the excecution plans !

    Thanks, Marcos.

  • Great article. Didn't understand example of 'TeamID' in the last part. Did you mean 'CustomerID'?

    Sorry thats typo. It should be CustomerID.

Viewing 15 posts - 16 through 30 (of 92 total)

You must be logged in to reply to this topic. Login to reply