No More Foreign Keys?

  • Comments posted to this topic are about the item No More Foreign Keys?

  • Yeah, have to agree with this one. If you are in a situation where an ID/Value approach is going to work better, get the heck out of an RDBMS like SQL Server and into one of the other database systems like DocumentDB. As was demonstrated on that site, the desire to remove primary keys and foreign keys actually comes from a place of ignorance, not a technology driven decision.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • All too true. Any database of any size that doesn't have primary keys will eventually have duplicates. Any database of any size that doesn't have foreign keys will eventually have orphans. No matter how fast a system is, if the data has integrity issues, then some of its results, whether a customer order or a report or whatever, will be affected. At least where databases that are appropriately done in SQL Server, it will always be so.

    Some database designers cause some of the issues. I have seen database designs that are poorly done in this regard, with every possible foreign key relationship made (I call them spider web designs). There is something about this that seems intuitive rather than mathematical, so maybe it's more difficult than I think. And a good number of OLAP database designers also don't think FKs are important ("the ETL takes care of it"). I personally think it's an unwillingness to do the exacting work necessary, but that's just my two cents on it.

  • I agree.

    In addition to eventual duplication, I also think transition to a data warehouse will be difficult at best, and the resulting information unreliable.

    A world without foreign key relationships is like a world without genealogy. Who's your daddy?

  • Complaining about foreign keys is sort of like when folks complain about those cement barrades that divide north bound and south bound lanes on a highway or the speed bumps in parking lots. It makes me wonder what folks would do if these safeguards weren't in place. What exactly were they trying to do when the constraint got in their way?

    True, foreign keys can be a bottleneck when bulk loading tables, but transactional databases are typically not bulk loaded, and when they are, it's only on a periodic basis or maybe a one-off migration process. For those occasions you can schedule a maintenace window where the foreign keys are disabled (truncated) prior to the load and then re-enabled (re-built) afterward.

    Arguments against foreign keys are more reasonable when in the context of a data warehouse. One option to the have constraints in the intermediate staging environment, so the data gets validated at that level, but are not created in the data warehouse. However, unless there is an important reason to minimize the load time, you may still want to keep foreign keys in the data warehouse as well, because in addition to certifying data integrity they are also helpful for performance reasons in that they help SQL Server construct efficient execution plans in a way similar to table statistics.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Hmm.... primary keys(or at least unique constraints) are very important as duplicates are a pretty big issue and if they're derived from natural keys are a very good way to prevent invalid data entry. My opinion on foreign keys is that they are much less important, orphaned records are not necessarily a big deal and they don't prevent the application from having to do work up front to make sure they're populated properly nor do they protect against bad data entry or sloppy front end application design from assigning foreign keys incorrectly in the same way primary keys do.

  • However, unless there is an important reason to minimize the load time, you may still want to keep foreign keys in the data warehouse as well,

    If a table has a heavy incoming load (millions of records) and there are few if any rows, I'll drop the keys for the load and add them at the end.

    An OLAP design will have the same issues. I've seen it before in large companies with large warehouses. Someone has classified "the ETL will take care of it" as his favorite of the big lies. I agree. I recently ran into some issues at another site. The response was "something is wrong with the ETL." No--something is wrong with the design that it can't catch it. I will tell anyone without these constraints that the integrity of the data can't be guaranteed. Kimball says in his latest book it's a critical aspect of the design.

  • ZZartin (8/31/2015)


    ...

    My opinion on foreign keys is that they are much less important, orphaned records are not necessarily a big deal and they don't prevent the application from having to do work up front to make sure they're populated properly nor do they protect against bad data entry or sloppy front end application design from assigning foreign keys incorrectly in the same way primary keys do.

    I'd be careful here. This is highly domain dependent. You might not have worked in a problem domain where orphans caused issues, but they can be very, very problematic for a business in some cases.

  • Sorry, but I think it's nonsense to suggest that FK are an inherent weak point in your architecture. It sounds like he wants/needs simple data stores and not necessarily an RDBMS to fit his model. I'd argue that this has more to do with his physical model constraints (performance limitations like latency etc.) and the trade offs of coupling and decoupling components than an RDBMS' limitations. Maybe one of the first questions to ask is: are we talking about a structured or unstructured data solution? Or did I miss it somewhere? If it's structured then I would think a more data-centric architecture is called for. If it's unstructured then I don't really take much issue with it.

  • Orphan records can pose serious problem in an operational or OLTP database. However, they are less of an issue in a data warehouse. There is this thing called "late arriving dimensions", where fact records arrive coded with things like Customer IDs or Product IDs that havn't been loaded yet. In a DW where reports are released periodically and most user interaction is in the form of ad-hoc queries, orphan records are more of a nuisance or known issue that gets resolved shortly with a followup data load.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • The applications that I've seen that are missing foreign keys have (at least these) three issues.

    1. Bad data in fields that should be clean.

    2. Extra deployments of the application code to add new and remove all enumerations that exist in lieu of type tables that are used with the foreign keys.

    3. Over time, there is a lack of understanding of what the keys are and the developers "don't have time" to go dig all the values out of the application. So then BI can't have the data they need to interpret the keys.

    Furthermore, if the developers are really that passionate about the entities being separated into groups, then they should use schemas, but keep all the tables in the same database.

    Mickey Stuewe
    Sr Database Developer
    My blog
    Follow me on twitter: @SQLMickey
    Connect with me on LinkedIn
    --------------------------------------------------------------------------
    I laugh loudly and I laugh often. Just ask anyone who knows me.

  • Steve Jones - SSC Editor (8/31/2015)


    ZZartin (8/31/2015)


    ...

    My opinion on foreign keys is that they are much less important, orphaned records are not necessarily a big deal and they don't prevent the application from having to do work up front to make sure they're populated properly nor do they protect against bad data entry or sloppy front end application design from assigning foreign keys incorrectly in the same way primary keys do.

    I'd be careful here. This is highly domain dependent. You might not have worked in a problem domain where orphans caused issues, but they can be very, very problematic for a business in some cases.

    Agreed. When the foreign key you want to do away with matches payments to invoices, duplicates in either direction would be a very big deal indeed. Not being able to conclusively match a patient to the bed they're currently in is also a major issue.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • This is nuts. Am I the only one who read to the end to notice the footnote by the author ("Low Teck Loy")

    * if you hadn’t caught on, just a jest … 🙂

  • I guess there will always be differences of opnion between those who see a database as basically a data reposity for an application, versus those who see a database as collection of normalized relational entities with no functional dependency on any specific application. Considering the pace at which applications are ramped up, retrofitted, and replaced, I think it's more in the interests of the organization to build the database in a normalized fashion. The database should transcend the lifecycle of the various applications, processes, and people who revolve around it..

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • debby_browne (8/31/2015)


    This is nuts. Am I the only one who read to the end to notice the footnote by the author ("Low Teck Loy")

    * if you hadn’t caught on, just a jest … 🙂

    It was the discussion about the post that was most interesting. And the stuff in there was not a jest.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 15 posts - 1 through 15 (of 63 total)

You must be logged in to reply to this topic. Login to reply