Problem with creating a Foreign Key

  • Hello,

    I have two tables. One is named GIS_Data, the other is Field_Data. Both have a column named ParcelID. Both are not primary keys (PKs), other columns are PKs. The ParcelID column needs to have duplicates in the GIS_Data table. The other column needs to have unique ParcelIDs. A many to one relationship, if you will.

    So, I like to create a foreign key in the GIS_Data table. I have this ALTER script:

    ALTER TABLE GIS_DATA ADD CONSTRAINT FK_FIELD_DATA FOREIGN KEY (ParcelID) REFERENCES FIELD_DATA ( ParcelID )

    But I get this error:

    The ALTER TABLE statement conflicted with the FOREIGN KEY constraint "FK_FIELD_DATA". The conflict occurred in database "LAVA", table "dbo.Field_Data", column 'ParcelID'.

    Can you help me?

    Thank you for you help.

  • There must be ParcelID values in GIS_Data that don't exist in Field_Data so you can't create the foreign key because the data violates the foreign key constraint. You'll have to clean up the data before adding the constraint.

    Since ParcelID isn't a primary key in Field_Data, I would have thought you'd get the message,

    "There are no primary or candidate keys in the referenced table 'FIELD_DATA' that match the referencing column list in the foreign key 'FK_FIELD_DATA'.

    Greg

  • A primary key is not needed, but a unique constraint at minimum.

    CREATE TABLE GIS_Data

    (ParcelID int)

    CREATE TABLE Field_Data

    (ParcelID int)

    INSERT into GIS_Data

    select 1 union

    select 2 union

    select 3 union

    select 4 union

    select 5 union

    select 6

    INSERT into Field_Data

    select 4 union

    select 5 union

    select 6 union

    select 7 union

    select 8 union

    select 9

    CREATE UNIQUE NONCLUSTERED INDEX IX_Field_Data ON dbo.Field_Data

    (ParcelID)

    ALTER TABLE GIS_DATA

    ADD CONSTRAINT FK_FIELD_DATA FOREIGN KEY (ParcelID)

    REFERENCES Field_Data (ParcelID)

    --These records have not match in the Field_Data table

    SELECT ParcelID

    FROM GIS_Data WHERE ParcelID NOT IN (SELECT ParcelID from Field_Data)

    --They must be updated or deleted from GIS_Data, or inserted int Field_Data

    BEGIN TRAN

    DELETE FROM GIS_Data WHERE ParcelID NOT IN (SELECT ParcelID from Field_Data)

    ALTER TABLE GIS_Data

    ADD CONSTRAINT FK_FIELD_DATA FOREIGN KEY (ParcelID)

    REFERENCES Field_Data (ParcelID)

    ROLLBACK

    BEGIN TRAN

    INSERT INTO Field_Data

    SELECT ParcelID FROM GIS_Data WHERE ParcelID NOT IN (SELECT ParcelID from Field_Data)

    ALTER TABLE GIS_Data

    ADD CONSTRAINT FK_FIELD_DATA FOREIGN KEY (ParcelID)

    REFERENCES Field_Data (ParcelID)

    ROLLBACK

    DROP TABLE GIS_Data

    DROP TABLE Field_Data

    Todd Carrier
    MCITP - Database Administrator (SQL 2008)
    MCSE: Data Platform (SQL 2012)

  • True, there is about 1,000+ records with Parcel IDs in GIS_DATA (some of these are duplicate IDs, as mentioned). Field_Data gets populated with Parcel IDs once users add field data to the record. Currently, there are 10 records in this table. I use an identity column as primary keys in both tables.

    Is there a way to structure the tables/relationships so I don't have clean up the data?

    I could make the ParcelID in the FieldData table a primary key?

    Any other advice on my archectecture?

    Thank you,

    Paul

  • Thanks for the script. I am kind of new to T-SQL could you give me a quick summary of what this does?

    I should mention that the records with the duplicate Parcel IDs are really unique records, in some way. There are about 14 columns in a record and 3 of which have different values in them. So, only the parcel ids are the same, but the rest of the values would be different. However, the application uses parcel ids to update the records.

  • Hi Paul,

    I guess this comes down to the guts of a Foriegn Key. What you are saying in your design is that you don't want any records in the GIS_Data table with an invalid ParcelID ( validity meaning that it exists in the Field_Data table ).

    So the question is... for those records in GIS_Data that have a ParcelID not residing in Field_Data, what to do with them if they are invalid? And the answer to that is either to enter corresponding ParcelID's into Field_Data, Update invalid ParcelID's to valid ParcelID's in GIS_Data, or set them to NULL.

    If the assumption above is correct, you must scrub these records prior to creating the Foriegn Key. Otherwise, the Foriegn Key cannot be created.

    Is it true that Field_Data is the parent, and GIS_Data is the child?

    Todd Carrier
    MCITP - Database Administrator (SQL 2008)
    MCSE: Data Platform (SQL 2012)

  • More for you Paul....

    The script I built was just so I can conceptually show you a simple example with your names. Run this script against a test blank test database.

    It creates 2 tables, with one column each (ParcelID), and populates them with some ids to recreate your problem.

    I need to create a unique index on Field_Data prior to attempting to create the constraint.

    First time fails with your error.

    Begin Tran and Rollback are blocks that execute the code and then undo everything once it has completed... so I can show that firstly... deleting the invalid records from GIS_Data allows for the creation of the constraint, and secondly inserting the missing ParcelIDs into Field_Data will also allow it.

    Further more you could:

    UPDATE GIS_Data set ParcelID = NULL where ParcelID NOT IN (SELECT ParcelID from Field_Data)

    ALTER TABLE GIS_Data

    ADD CONSTRAINT FK_FIELD_DATA FOREIGN KEY (ParcelID)

    REFERENCES Field_Data (ParcelID)

    This updates the records in GIS_Data with an invalid ParcelID to NULL. This is perfectly acceptable, but may not be logical, and the foriegn key will be create.

    Todd Carrier
    MCITP - Database Administrator (SQL 2008)
    MCSE: Data Platform (SQL 2012)

  • I see the GIS_Data as the master table. This is where ALL parcel ids live. Some are duplicate IDs, but the records are not duplicates. True, some more normalization could happen with this data, but we get the data from GIS systems from Counties here in Washington State and the data comes as is. It comes in as Excel spreadsheets and then we import it to the GIS_Data table. Because the way the County GIS systems calculate acreage and other things I don’t really understand, we get these records with duplicate parcel IDs. Deleting the records would cause an alarm with auditors.

    All Parcel IDs from the Counties are all valid and we must use them to identify parcels. The application allows users to add attributes to parcel records based upon the parcel ID. Basically, the Field_Data table and several other tables hold these attributes that we enter into the application. When a user adds attributes to a record in the GIS_Data, the attributes are added to the Field_Data table and others tables.

    It’s hard me to think the Field_Data table is the parent, when it really could have no records in it and when the GIS_Data table has all of the Parcel IDs.

    Do I really need a foreign key? All is happening is when a user adds attributes to main record in GIS_DATA, a new record is added to Field_Data.

    I don’t see a need for a strong link between these 2 tables.

  • Aha, I think I understand now. How about this...

    SELECT ParcelID from Field_Data where ParcelID not in

    (SELECT ParcelID from GIS_Data)

    Are any rows returned? Do all of the records in Field_Data have corresponding ParcelIDs in GIS_Data? I bet they do, as you describe GIS_Data as the parent.

    In this case, your foreign key is pointing the wrong way, and you really want to create the contstraint on Field_Data to reference GIS_Data.ParcelID. Unfortunately, like you stated... there are duplicates, so a Unique constraint cannot be created on the column, and therefore a Foreign Key can't reference it.

    To do so really requires, as you said, normalization... or the use of a different key. Based on your statement that you don't need a strong link, a foriegn key is probably not necessary. Sure, good database design protects integrity, and in this case you are depending upon the application to get it right. Not best practice, but a practice nonetheless!

    Todd Carrier
    MCITP - Database Administrator (SQL 2008)
    MCSE: Data Platform (SQL 2012)

  • Common issue understanding the ways of the Foreign Key.

    Every now and then if I haven't done one in months... I gotta go look!

    Now which way does this thing go? Which is the foreign and which is the primary?

    Seems like that concept tripped me up when I was starting.

    http://msdn.microsoft.com/en-us/library/aa933117(SQL.80).aspx

    Todd Carrier
    MCITP - Database Administrator (SQL 2008)
    MCSE: Data Platform (SQL 2012)

  • Yes, the querry does. A Parcel ID in the Field_Data table should be in the GIS_Data table too.

    Plus, this is a temporary app, so doing it right will spend more time than it may be worth it.

    I agree. Its not the best design, but it works and I don't have many options. A "good design" could mean a lot of more code and risk.

    Thanks for all of your help.

    Paul

  • Thanks, this is great link!

    Paul

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply