Lookup Table Madness

  • The reason this thread gets so many posts is because there are an infinite number of ways to do something wrong.  The drug/allergy example would be one of them.

    First of all, the "Codes," master or otherwise, are mearly a convenience to the physical database; they are only but surrogates and meaningless outside of the system.  As such, they should NEVER be exposed to end users (by the way applications are clients too).

    Furthermore, you cannot have a surrogate until you have KEY to surrogate to begin with.  I've seen too many designs where these "Codes," "IDs," or "GUIDS" where the only defined key in the system.  No wonder this stuff is riddled with bugs.  Keeps the vendors and application support personnel in business though... Planned Obsolescence?  That might actually be one of the valid reasons for bad design.

    You have no data integrity if you cannot even define the container in which the data resides.  As someone else said, "It is only a bucket of crap."  Well, it sure smells like it, even if that is not what the developer intended.

    In this example, Drugs are NOT allergies.  Drugs are drugs and need to be defined and stored that way.  Patients (which wasn't defined but would need to be) HAVE allergies TO drugs, only those which are to that patient.

    Sounds like a new entity to me...Patient Allergies, with an FK to Patient and an FK to Drug.

    Now, perhaps I've missed your point; so, please clarify if I am wrong.


    Anthony Thomas


  • Just to clarify, there are no codes exposed to users. 

    And yes, there are separate entities handling associations between patients and drugs and a different entity of PatientAllergies.

    The specific question was in relation to codes that are appropriately used in multiple places in the data, with different pick lists for different uses.

    Thanks so much.


  • I guess the lines might blur slightly when you have a table of drugs, some of which could be used to treat multiple conditions.

    EG, we are developing an electronic forms app for healthcare.  A lookup table of drugs is obviously needed.  In different areas of the form the user can choose from a set of drugs.  The choice could be for pain medication, anti inflammatories, antibiotics, etc.  It may be that some drugs are capable of being used in several areas - eg Ibuprofen (Nurofen to most Aussie folk) is both an anti inflammatory as well as pain medication.

    I could see a single lookup list of drugs being used.  Another table could store the potential uses of drugs (or this is hard coded as the potential uses won't change without the app changing).  Finally another table could store a combination of drug + allowed usage.  Thus the elec form could go to the DrugUsage table to find all available drugs for a given use.

    This allows the drug ibuprofen to be considered the same in different areas of the form.  smknox's post made me think to write this - not sure how relevant it is now that I further think about it, but it should at least contribute to the lengthy discussion

  • Just a thought: isn't it possible that a patient can be allergic to a combination of drugs, but not the drugs individually? Would your system handle that combination?

    Based on what has been said, tho' a bit hard to follow, my recommendatino is to optimize for flexibility.

  • It's typical for patients to be allergic to several individual drugs (pencillin, sulfa, others), but not necessarily a combination of drugs. I take your thought to mean that they'd have to take penicillin and sulfa together to get an allergic reaction. No, that isn't an issue.

    Drug reactions are an issue, whereby two drugs given to the same patient can interreact.

    Trying to get to the heart of this issue, I return to the example many posts ago of employees and clients both using the same lookup table form something, say type of phone number (I couldn't find the post). Then later you want to change the terminology for employees but not clients. You have a problem. So it would have been better to have every individual pick-list populated from it's own specific lookup table.

    I'm just saying there would be a lot of redundancy if it's something like a list of drugs, and there is some merit to consolidating that into one large list and selecting certain drugs for different pick-lists. An overriding priority is that the same drug is coded consistently. You're introducing potential error by listing the same drug in 10 different lookup tables and keeping that drug's code consistent across those tables.



  • What in the world is that garbage: "optimize for flexibility," "optimize for speed."  Try this one out for a change.  How about "optimize for correctness and integrity."

    You tell me, is it more "optimial" to have to go to only one location to create or modify the data, or to have to search for duplicates and variants all over the place?  Is it more "flexible" to describe and model the data so you can ask it an infinite number of questions, based on the end user's point of view, or only narrowly defined by a single structure the developer invisioned?

    What people mean when they say stupid things like this isn't "optimal" or "flexible" for the end users.  What they really mean is that it is "optimal" (in the sense of less development time) and "flexible (in the sense that the developer can quickly get the request off of his or her desk) for the developer, not the client.

    Like many in our field, I began as a developer.  I worked with the file-based, so called database, systems.  I've worked with the DBMS and relational systems.  If you follow the rules, the RT is THE most "flexible" and "optimal" system out there.  But you have to put your energy into the design so you don't have to put the energy into the usage.

    Can the solution above solve the combination drug allergy, no; it was not apart of the original requirements, but can be easily extended to handle such scenarios.  And, when it comes to the data migration component, it will be a lot easier to take the single set of data and convert it than if you had to chase all that information, spread out all over the design, and have to contend with the myriad of transforms you would have to go through to squeeze that square peg into the round hole.


    Anthony Thomas


  • Word, Anthony; but ouch! Can we just put this to rest? SQLServerCentral lets a lot of different people write articles, some with better ideas than others. Many people don't want to design relational databases but want to use a RDBMS anyway. Don't ask me why; it's a mystery.

    Let those people write what they want and pray to God you are hired to replace them and their slow and buggy code instead of being forced to update or maintain it.

    If there's one thing I've learned it's that arguing with people who don't understand is pointless at best, infuriating at worse. In the spirit of completely not getting it, I say "Let them eat cake"!


    Signature is NULL

  • I agree that this gets old, but isn't this a public forum?  If some are allowed to write this dribble, then aren't others allowed to respond?  And, the dribble keeps on coming.

    What are you really concerned about?  That I am rough?  Are the arguments so shallow that they need your...protection?

    I am not agianst you; I'm really not against those who write comments like that.  How could I be; I don't even know them.  Nevertheless, if one puts up an argument, in a public forum, isn't the public allowed to point out the flaws in the logic?

    "Many people don't want to design relational databases but want to use a RDBMS anyway."

    This is the problem.  Many people may want to drive a car, but if they do not want to follow the rules of the road, then they get their driver's licenses revoked, and they have to find other means of transportation.  Do you think you can just go get a pilot's license whenever you feel like it just because you think it would be really "neat" to use that tool?

    If one tries to use an RDBMS for non-relational solutions, they should have their keyboards and mice taken away, never to "solve" again.  Come on, are these really "solutions," or are they just delaying the problem and causing others to solve new problems introduced by bad design?

    Look, there are other technologies out there: files, XML, XML DBMSes (I smell an oxymoron), OODBMSes (and again), MVDBMSes (worse).  Let them go clog those systems up, I won't get woke in the middle of the night when one of those systems comes crashing down and you have all the "C" executives screaming down, "How could this happen?"

    Let them eat cake....  Bah!  Let them eat the swill they keep trying to shove down others throats.


    Anthony Thomas


  • Anthony,

    Can't agree with you more. But calm down, I'm on your side, as is reason and experience! You seem to think I've advocated MUCK tables at some point? If you'd read my posts you'd see quite the opposite; I've been extremely vocal against them!

    If someone wants non-relational data it should obviously be stored in a non-relational way and not in a Relational DBMS. This is so basic that the argument over it strikes me as slightly ridiculous.

    All I'm noting is that there are over 170 posts on this particular forum, and that it's been going on for TWO YEARS! Two years of beating a dead horse, with the occasional "I don't see what the big deal is" post to get everyone going again.

    I'm simply suggesting that enough is enough, and that comment is not aimed at you, but at the entire discussion.

    Signature is NULL

  • Couldn't agree more.

    A good article, but the subsequent discussion has pretty much run its course.


    Anthony Thomas


  • The personal attacks are not warranted. When responding to a post here, it is really just good manners to assume that the person who posted the reply is intelligent and knows what s/he is doing.

    That said, "good", "correct" is in the eye of the beholder. If you read back a ways in this thread, you'd read where yes, development time is much more important optimization than sheer speed in some application spaces. In the clinical trial world, if a routine takes 10 or 20 or 30 minutes, it usually doesn't matter a bit. However, if the developer wastes a day or two optimizing those 20 minutes away, that's not good.

    Plus, you have systems where the metadata can and does change in the middle of the trial. You need to handle that, not kvetch that the specs have changed, I can't work in these conditions.

  • As much as I would like to agree with you, the posts here are not at all uncommon from those I work with, and, unfortuanately, more often than not, they DO NOT know what they are doing.

    I understand your situation, but I would suggest that in your position, you are attempting to use the wrong tool.  If using it correctly just takes too much time, then find a different architecture that suits your situation better.

    But then, again, what do DBAs know?  Developers are king, right?


    Anthony Thomas


  • It is important that incorrect (again, depends on your point of view to some extent) posts are debated.  It wasn't really that "ouch" from my point of view but again that depends on the point of view.

    Glad I generated some discussion   To return to the topic (and then leave this thread to die in peace) I thought that it would be beneficial to have a common drugs table so that they were coded consistently.  Each drug has one or more applications - the area of the form dealing with a particular application gets those drugs in the list that may apply in those circumstances...  If a drug changes name, then this is easily amended in one place.  The drug when used as a pain relief medication vs when used as an anti-imflammatory is still the same drug.

    Drug interactions are a whole different ball game - one which we don't want to touch as others do it very well already with quite affordable software and regularly maintain their list of drug interactions.  Our software is also targeted at specialists more than general practicioners - the specialist doctor doesn't have the full patient history (certainly not electronically anyway) so drug interactions & allergies detected by the computer cannot be reliably performed.

    I also don't really know how to interpret "optimal".  I tend to go for ease of use for the user, avoiding designs that need strange explanations and hopefully avoiding the cringe when I or someone else looks at the code / design later


  • No, users are king. Liberace said it best, "Without the business, there's no show". How you optimize should depend on the business needs first.

    OK, lemme ask you. There are 250 clinical trials, each heterogensously structured, each with about 30-40 different table structures. The analysis software is run maybe 3 or 4 times in production. The analysis can easily be run in the background as the user does other things.

    So, what do you optimize for?

  • You are absolutely correct, the Business Model rules because it is the model that contains all of the business requirements.

    The Data Model models the data about which we describe the Business Model, from which the database is derived.

    The Application Model, however does not describe the business nor the data, merely the business requirements as to how the data is processed.  However, how do you model the process until you have all of the inputs, the business requirents and data structures?

    To often, the Data Model (if there is one at all) is constructed or derived only after the application process has been constructed.  It is this that I contend that is backwards.

    As for your clinical trials, clearly you have provided insufficient information from which to derive any meaningful discussion.  However, as far as what you optimize, optimization is last.  You model the business; you model the data, for integrity because without it, the storage is meaningless, even if optimal, I assume, for speed.

    The whole point of normalization is for the reduction of data anomolies, from which the data maintains its consistency.  Without this, the rest is fruitless.  A side effect of normalization is performance optimization.  Now, this is argued against quite often due to the JOINs that must take place.  However, if you consider all of the performance issues, not just the JOINs, but include the time necessary to maintain the integrity of the system, you will see that the normalized database is more efficient.

    I really do not believe either one of us will agree with the other on this point, as you have brought up your clinical trial data before.  However, you have yet to provide the structures of this data, either as you have designed it, or as we have suggested.


    Anthony Thomas


Viewing 15 posts - 166 through 180 (of 202 total)

You must be logged in to reply to this topic. Login to reply