No More Foreign Keys?

  • call.copse (9/3/2015)


    ZZartin (9/2/2015)


    RonKyle (9/2/2015)


    I guess why I find this topic so interesting is that I keep hearing this "the consequences will dire" statement about not using foreign keys and I just don't see it

    I guess you've never had to explain why parts of orders are missing and so on. How "dire" the consequences are is perhaps open to interpretation. But for the customer who didn't get an order filled because there were line items but no header the impact is consequential.

    This is a topic that speaks to professionalism. We don't expect people who build our homes to take shortcuts. We who build databases shouldn't as well.

    Okay.... and in that example with foreign key constraints in place the customer still wouldn't get their order because not only would there not be a parent order there wouldn't even be line items which would make trying to track down the issue that much harder.

    Not to sound harsh, but have you ever done any real, complex database development work? It certainly sounds like you haven't.

    For instance, typically what might occur is that some peculiar web hiccup allows an invoice to be entered without a customer. Then some developer's assumption is proven wrong and the invoice screen and / or reports all stop working. The company loses money whilst this is fixed. This is a simple and obvious example that could of course be avoided - but there are a few thousand variations on this theme that can, do and will occur without foreign keys.

    It's clearly not a universal panacea to stop inconsistent data but it's practically a free check if designed in early, that allows the developers to concentrate on the real issues that will always come up on anything of any significance.

    It's nice to see you have extensive knowledge of my work experience based on my one opinion of one feature.

    I've never said I think foreign keys are bad, all I've said is that I don't understand why some people consider them a must have. If an application is so badly designed/developed it is consistently trying to assign bad foreign key values that is a problem that should be fixed at the root cause not by hiding it behind DB errors(once again as I said if an application is assigning bad foreign key values how do you know that even when it doesn't trigger a bad foreign key constraint it's actually assigning it to the right parent). While it's nice to have the extra check in the DB the work of determining appropriate foreign key values should already have been done in the application correctly, I don't think that's an unreasonable expectation.

  • ZZartin (9/3/2015)


    While it's nice to have the extra check in the DB the work of determining appropriate foreign key values should already have been done in the application correctly, I don't think that's an unreasonable expectation.

    What about changes that don't go through the application? Suppose you raise a support issue and a junior technician sends you a script to fix it - who or what will do the checking then? What when the application vendor is taken over and a whole new group of programmers takes the code in a different direction and accidentally (or otherwise) leaves out the relationship checks? No, database integrity should be enforced in the database layer.

    John

  • The purpose of foreign keys are understood by all here.

    But, aside from it's impact on bulk load performance (typically not an issue for OLTP databases), can someone recap what the cons of foreign keys are?

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I suppose you could argue the cost is in complexity and rigidity. Depending on the scenario that can be a good thing (you need structured, efficient data). I think we would all benefit by embedding DBA's into development teams if only to get better data models and scalable applications.

  • ccd3000 (9/3/2015)


    I suppose you could argue the cost is in complexity and rigidity. Depending on the scenario that can be a good thing (you need structured, efficient data). I think we would all benefit by embedding DBA's into development teams if only to get better data models and scalable applications.

    Developing a microservice layer just to enforce data validation rules sounds more complex than creating foreign keys in the database layer. It certainly would require more lines of code. Regidity is the point, which is why they're called constraints. Foreign key constraints do force the application developer to persist data as a consistent unit of work, which probably requires them to write better quality programming code.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Even if a case can be made to not use Foreign Keys, you still have to live with "Low Tech Ley" wish to fragment the database into separate (micro)databases. Referential integrity isn't a problem that "Low Tech Ley" is trying to solve, it's a problem created by his solution to his actual problem.

    I see "Low Tech Ley" problem as wanting to be able to make a change to single micro-service without impacting the other micro-services. His micro-services are highly decoupled except when it comes to the DB, because they all reference the same schema. A schema change for one micro-service is in danger of propagating changes to other micro-services at the same time. His services are no longer independently deployable so he's sad. What is a better solution to "Low Tech Ley" problem without forcing him to give up his ORM that is polluting all his Java/C# code with schema details?

    EDIT: The bit I don't understand about "Low Tech Ley" solution is that after segregating all the data so that each logical database entity is the "responsibility" for a single micro-service, why does he still need to split the database up physically? He's already done it logically.

  • @ronkyle

    Not you indeed, I'd not be questioning you, as we were both arguing in favour - sorry for any confusion.

    John Mitchell-245523 (9/3/2015)


    ZZartin (9/3/2015)


    While it's nice to have the extra check in the DB the work of determining appropriate foreign key values should already have been done in the application correctly, I don't think that's an unreasonable expectation.

    What about changes that don't go through the application? Suppose you raise a support issue and a junior technician sends you a script to fix it - who or what will do the checking then? What when the application vendor is taken over and a whole new group of programmers takes the code in a different direction and accidentally (or otherwise) leaves out the relationship checks? No, database integrity should be enforced in the database layer.

    John

    Quite.

    Also how else do you find about the bad allocation of records? With the foreign key you get a nice logged error (at least in any sensible application) that helps you track down the misallocation.

    If you add a new application that does some work to an established database can you be sure it will do the right thing? If you need to clean up some customers but maintain an order trail - what stops the accidental deletion of the wrong ones? I just fail to see how they are ever not a useful thing - sure, not for the odd table say in an import situation or something but otherwise....

  • ZZartin (9/3/2015)


    I've never said I think foreign keys are bad, all I've said is that I don't understand why some people consider them a must have. If an application is so badly designed/developed it is consistently trying to assign bad foreign key values that is a problem that should be fixed at the root cause not by hiding it behind DB errors(once again as I said if an application is assigning bad foreign key values how do you know that even when it doesn't trigger a bad foreign key constraint it's actually assigning it to the right parent). While it's nice to have the extra check in the DB the work of determining appropriate foreign key values should already have been done in the application correctly, I don't think that's an unreasonable expectation.

    In World War I the trenches and machine guns initially proved to be unbreakable. Then, a way to break them was found. After that they developed defense in depth, a series of multiple defensive lines that provided more than a single point of failure. At the very least, and I would argue that there are a lot more reasons, FKs provide us defense in depth instead of relying on a single point of failure at the application layer.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (9/3/2015)


    ZZartin (9/3/2015)


    I've never said I think foreign keys are bad, all I've said is that I don't understand why some people consider them a must have. If an application is so badly designed/developed it is consistently trying to assign bad foreign key values that is a problem that should be fixed at the root cause not by hiding it behind DB errors(once again as I said if an application is assigning bad foreign key values how do you know that even when it doesn't trigger a bad foreign key constraint it's actually assigning it to the right parent). While it's nice to have the extra check in the DB the work of determining appropriate foreign key values should already have been done in the application correctly, I don't think that's an unreasonable expectation.

    In World War I the trenches and machine guns initially proved to be unbreakable. Then, a way to break them was found. After that they developed defense in depth, a series of multiple defensive lines that provided more than a single point of failure. At the very least, and I would argue that there are a lot more reasons, FKs provide us defense in depth instead of relying on a single point of failure at the application layer.

    True to that.

  • Eric M Russell (9/3/2015)


    ccd3000 (9/3/2015)


    I suppose you could argue the cost is in complexity and rigidity. Depending on the scenario that can be a good thing (you need structured, efficient data). I think we would all benefit by embedding DBA's into development teams if only to get better data models and scalable applications.

    Developing a microservice layer just to enforce data validation rules sounds more complex than creating foreign keys in the database layer. It certainly would require more lines of code. Regidity is the point, which is why they're called constraints. Foreign key constraints do force the application developer to persist data as a consistent unit of work, which probably requires them to write better quality programming code.

    I agree, but your question was more general in nature or so it seemed. If you are designing an application that generates a lot of unstructured data (log flavored info for example), you might want to go easy on the RDBMS components or not use them at all. If you're lucky enough to have clear requirements that the application is central to the design (mobile app perhaps?) and the data is secondary in importance then I wouldn't probably pick an architecture that is too rigid. Payment system? Almostly certainly an RDBMS and the orthodox components. I think Steve picked a good topic if only because it's one of those DBA favorites 'it depends' issues.

  • It's a huge "it depends" issue. More and more I think there are domains where we are assuming that an RDBMS is the best fit. As I look to see where other types of NoSQL systems fit in, I think we've used RDBMSes at times where they aren't necessarily needed.

    However. If you have a model that uses an RDBMS, especially in a legacy (existing) application, FKs are a layer of defense (as Grant mentioned). This provides for guaranteed matching of parent/child rows when the application might not enforce this. Or changes are made outside of the application.

    I think that expecting all changes to be made in the application is naive. It's idealistic. Applications will forever be evolving. That means they may not have functionality needed by users, especially the tooling for maintenance operations. There may be multiple applications. The code may have bugs. A FK prevents those issues from hitting data.

    Downsides? You must have matches, which may not be what you want, especially in the short term. Maybe I want orphans as the data model changes. Maybe I want to be able to load data quickly in parallel in parent/child tables and I know the data is good. There are times you might disable the FK, but in general, I think this is something that should be in place most of the time if a relationship exists.

    The microservice architecture is another battle. I think this makes sense for some domains of problems, but it also is likely to cause lots of issues in others. We do need data to relate to other pieces of data and it's not as simple as copying the data to another service or querying across services (which arguably breaks the microservice model).

    I suspect, however, that what we need in some apps is "smaller" services, or mini services, not micro ones. However we'll have to build applications and work out where we can make database splits and still have an application perform well. I'm not confident we'll find these splits because in most cases your scale is low enough that it's not worth attempting to split a monolithic database. Or it's large enough that you have an issue that doesn't need exact concurrency and work with an SOA/NoSQL/ETL architecture to store and move data.

  • I don't suppose we have someone on the boards who works in the online gaming industry? Specifically the MMORPG or the like? My information is pretty old and limited but from what I've read they basically trade space for speed. By that I mean they basically cache their user data in different data centers on different continents/regions and route the queries through a custom built distributor (or similar component) to get it the user as quickly as possible. I'm really curious how that influences their schema design (if at all). I would think that they would be faced with similar distributed application/data challenges that most of us have in the traditional business sector.

  • ccd3000 (9/4/2015)


    I don't suppose we have someone on the boards who works in the online gaming industry? Specifically the MMORPG or the like? My information is pretty old and limited but from what I've read they basically trade space for speed. By that I mean they basically cache their user data in different data centers on different continents/regions and route the queries through a custom built distributor (or similar component) to get it the user as quickly as possible. I'm really curious how that influences their schema design (if at all). I would think that they would be faced with similar distributed application/data challenges that most of us have in the traditional business sector.

    I did. I worked on a number of MMORPG's over the course of 7 year span where we used Oracle. But I was not a database professional although I did work in the same area of the database team. I can poke some of the guys still working on those games and let ya know some more details if you like.

    What I do know is that the database used to be a constant bottleneck simply due to the high amount of transactions and reads needed in order to support hundreds if not thousands of players simultaneously across persistent worlds. Due to that, I do believe services were created to sit in front of the database that basically cached data in memory per server. When it came to game logic that required data from players, items and other world assets, that was typically cached on the client end as long as possible to reduce latency.

    On that note, MemSQL has been known to be used by FarmVille and the last online game I worked on replaced Oracle with CouchDB (NoSQL) with much improvement in performance.

    Check this out on what Playfish did -> http://highscalability.com/blog/2010/9/21/playfishs-social-gaming-architecture-50-million-monthly-user.html

    And here are some cool quotes from EVE online DBA. Eve Online is one of the few games that runs one world architecture. That means all players play together across multiple game servers with multiple instances (i.e.: private playfield that can be loaded or deloaded depending on need). They have over 1,000 players playing together in close proximity without bringing down the database or game servers.

    Size - When talking about the size of their databases supporting EVE Online. They have multiple databases supporting game assets and player information.

    We also have difference services (sso/vgs/images) in different DB's, so while the core DB itself is around 2 TB (2.8 w/ free space in files for growth), the entire stack is closer to 5-6 TB total.

    Item Iventory - Iventory for items gained in game. I've seen some MMORPG's handle this horrifically to where bags, containers and so forth cause DBA nightmares (i.e.: bags of stuff within bags of stuff).

    The inventory system is around 250 GB - this includes multiple indexes on the same data though so we (or you) can access it faster, etc...

    every item/stack per location has a unique row in the DB, for a total of around 2.2 billion rows at the moment 🙂

    Ram

    The most insane part of our DB is ram IMO - we're running 512 GB of ram, so it's able to pull a tremendous amount of data into memory and that gives us great speed.

    That much RAM coupled with fast SSD's (and properly indexed data) and you can do quite a bit 🙂

  • This is good stuff. Thanks for sharing. I found 'Lessons Learned' particularly confirming. Especially number 1:

    1.)Build an architecture that takes advantage of the special nature of your application. Playfish has tailored an architecture to their very specific needs. Don't try to build a general architecture that will scale everything. Take advantage of the nature of your space to make life as easy as possible.

    I think we, as human beings, love to generalize and create models. It's amazing how unsuccessful we are at it given how long we've been doing it. Maybe we need a lot more models? It seems to me that programs tend to (over time) behave more like primitive life forms than elegant mathematical functions. Very messy and complicated. They seem to morph into the step-children of Frankenstein and Rube Goldberg. In short, they become increasingly irrational systems. That would be tough to model for anyone. But hey! Maybe it's just the coffee talking...:w00t:

    Thanks again.

  • ccd3000 (9/8/2015)


    This is good stuff. Thanks for sharing. I found 'Lessons Learned' particularly confirming. Especially number 1:

    1.)Build an architecture that takes advantage of the special nature of your application. Playfish has tailored an architecture to their very specific needs. Don't try to build a general architecture that will scale everything. Take advantage of the nature of your space to make life as easy as possible.

    I think we, as human beings, love to generalize and create models. It's amazing how unsuccessful we are at it given how long we've been doing it. Maybe we need a lot more models? It seems to me that programs tend to (over time) behave more like primitive life forms than elegant mathematical functions. Very messy and complicated. They seem to morph into the step-children of Frankenstein and Rube Goldberg. In short, they become increasingly irrational systems. That would be tough to model for anyone. But hey! Maybe it's just the coffee talking...:w00t:

    Thanks again.

    No problems.

    Ideally, the reason I am working with data today versus when I was in the game industry was due to those crazy DBA's managing these online worlds. I mean when you think about it, all these online games are nothing more than large databases with pretty pictures and sound.

Viewing 15 posts - 46 through 60 (of 63 total)

You must be logged in to reply to this topic. Login to reply