Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12345»»»

Lookup Table Madness Expand / Collapse
Author
Message
Posted Wednesday, September 8, 2004 7:47 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 1:51 PM
Points: 1,113, Visits: 707
Code/MUCK tables (or more formally, entity-attribute-value tables) violate the first normal form and therefore have severe data integrity issues. Keeping an open mind, though, I would be very interested in an example of a situation in which you feel that they provided an elegant solution.

--
Adam Machanic
SQL Server MVP
SQLblog.com: THE SQL Server Blog Spot on the Web
Post #135829
Posted Wednesday, September 8, 2004 7:49 AM
SSChasing Mays

SSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing Mays

Group: General Forum Members
Last Login: Wednesday, September 8, 2004 7:36 AM
Points: 634, Visits: 1

The article was a little wordy but in general I understood what the author was getting across. For data integrity, each lookup table should be a separate entity with a foreign keys coming from each table that references (uses) the code. That said, I have also used (or been made to use) the MUCK approach, and yes, in production with some success. Of course, bad entries eventually made their way into the tables, but it was not very often.

-Vic

Post #135831
Posted Wednesday, September 8, 2004 8:49 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Friday, December 5, 2014 8:45 AM
Points: 1,035, Visits: 412

I plead guilty to being a zealot.  I am absolutely a zealot when it comes to proper database design because I have to deal with the consequences of poor database design on a daily basis.

Now, as to your charge that I didn't add any facts, and that Code tables present an  "elegant solution", I would ask you to provide some facts of your own.  You and others have accused me of being overly wordy, and that might be the case to some extent, but I felt that it was necessary to fully illustrate the point.  Part of my "wordiness" is due to the diffuculty in trying to deal coherently with that which is essentially incoherent.  However your overly brief statement as to the "elegance" of code tables is hardly believable since you provide no evidence (and I don't believe that you ever could) unless your brevity is to be taken for proof...

Of course hierarchical data will always be around, so what?  A properly designed relational database is perfectly capable of representing hierarchies, however, hierarchical databases (to include XML) have a very difficult time properly representing the more complex (and useful) "has a" logic for data that is not inherently hierarchical.




/*****************

If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek



*****************/
Post #135842
Posted Wednesday, September 8, 2004 9:03 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Friday, December 5, 2014 8:45 AM
Points: 1,035, Visits: 412

Thanks for the input.  This article went through a number of iterations and was reviewed by at least 5 different people for readability.  That's not to say that I couldn't have done a better job, but I'm not a professional writer. 

A few notes:  I presented the normalized table structure first because I was working on the assumption that that is the starting point and that the majority of the audience should be familiar with it.  I wanted to show how well meaning, but generally ignorant, database designers actually ruin a well designed database by including this extra step.  I further wanted to show how this extra step was not only not beneficial, it takes one back to all the problems that were resolved by normalization.

Not to blame others for any problems that might exist in the article, but many of the parenthetical points (and I know there are a number of them) were actually in response to questions my various reviewers asked.  "What about this?", "So, are you saying...?" etc...




/*****************

If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek



*****************/
Post #135845
Posted Wednesday, September 8, 2004 9:04 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Wednesday, August 6, 2014 1:19 PM
Points: 289, Visits: 72

I would definitely like to see an example of a case where a MUCK table was a good solution.  What is the value of this sort of design?  Just fewer tables?  I find that good database design not only enforces data integrity, but helps decode your data for future generations, especially if you're careful about naming conventions.  I've had to reverse engineer more than one database in my career and let me tell you that that above anything else has made me a successful database designer.  As such, I'm willing to acknowledge that certain deviations from convention are necessary for certain applications, I'm just curious as to the actual virtues of these so-called MUCK tables.

~Pam





The greatest obstacle to transforming the world is that we lack the clarity and imagination to conceive that it could be different. -- Roberto Unger
Post #135846
Posted Wednesday, September 8, 2004 9:22 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Friday, December 5, 2014 8:45 AM
Points: 1,035, Visits: 412

I appologize for the confusion.  I believe a careful reading of the article will make this very clear, but to summarize:

The three seperate tables, each with a single type of code is the proper method.  The combined table causes numerous problems and its use cannot be justified.

How's that for being non-wordy?




/*****************

If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek



*****************/
Post #135851
Posted Wednesday, September 8, 2004 9:35 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Today @ 1:51 PM
Points: 1,113, Visits: 707
The perceived virtue of entity-attribute-value tables is that they will assist in creating a simplified schema (less tables = simplicity in some peoples' minds). The other perceived benefit is that there will be a performance gain due to less JOINs.

Neither of these are real. These tables are not typesafe and all-but-impossible to constrain, thereby destroying any simplicity come maintenance time. Furthermore, if and when the database grows beyond its test size, the single-table solution will almost always result in performance issues due to table scans.

Projects I've seen that utilized these tables shared some common traits: The databases were designed by junior team members, the projects were under budget, and by the time I got there the tables had corroded into massive problem areas that no one wanted to touch. One particular project had a table of this sort with a "datatype" flag. One of the flags was 'bool', which according to the data dictionary (which had not been updated for three years), should be '0' or '1'. Of course, with no constraints in place, this wasn't maintained. When I showed up, I discovered: '0', '1', 'Y', 'N', 'T', 'F', 'Yes', 'No', and NULL. Various applications that had been coded against the database all relied on their particular token value, so fixing the issue took months.


--
Adam Machanic
SQL Server MVP
SQLblog.com: THE SQL Server Blog Spot on the Web
Post #135854
Posted Wednesday, September 8, 2004 10:11 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Tuesday, January 17, 2006 10:06 AM
Points: 28, Visits: 1
 

I've used this table structure for lookups, codes, two-field type situation for years.  Its been extremely effective and held under almost every situation.

CREATE TABLE [dbo].[LookUp] (
 [GroupName] [varchar] (25)  NOT NULL ,
 [EntryId] [int] IDENTITY (1, 1) NOT NULL ,
 [ListEntry] [varchar] (30)  NULL ,
 [SortOrder] [int] NULL ,
 [Description] [varchar] (150) NULL ,
 [DateActivate] [datetime] NULL ,
 [DateDeactivate] [datetime] NULL ,
 [Active] [bit] NULL ,
)

Its got the ListEntry (or Code) and the Description to spell it out, but the nice extras were SortOrder and the DateActivate/DateDeactivate.  That helped avoid the dreaded deletion of inactive codes.  The Active column was a little extra that I maintain through a nightly job.  But with the GroupName column, I get to combine everything together.  Also, there's the requirement that GroupName+ListEntry be unique.

On another note, despite the apology in the article, the sarcasm was much too heavy. 

Post #135864
Posted Wednesday, September 8, 2004 10:14 AM


Old Hand

Old HandOld HandOld HandOld HandOld HandOld HandOld HandOld Hand

Group: General Forum Members
Last Login: Tuesday, February 11, 2014 7:59 AM
Points: 354, Visits: 67

First and foremost a database is designed in order to establish order out of chaos.  It contains data to facilitate information sharing; however, it contains rules by virtue of relational priciples that provide meaning to that information without which no understanding is passed on to the end-user.

Who cares whether or not a particular database solution would or would not work for an application?  Who cares about performance?  Performance is a secondary requirement, not the first.  Performance can be improved by hardware, by physical database mechanisms, and, yes, by properly normalized database design.

The database should be, must be, designed for the Business Model it is trying to mimic, nothing more, nothing less.

Applications are designed to provide specific actions in the context of the GIVEN, not created, database model.

The whole context of whether or not MUCK tables would be good here or there or in this or that application is pointless.  They are a bad database design for all applications...period.

If you want a MUCK and you want the application to manage those rules, feel free.  Create a VIEW that UNIONs all the entities together for you, create a class, embed and "encapsulate" all of the code muck you'll have to create to maintain it, and present it to whatever application desires its usage.  Although, I will presume it will have performance "issues."

But for the database, follow the rules...the relational rules.

 




Post #135866
Posted Wednesday, September 8, 2004 10:15 AM


SSChasing Mays

SSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing MaysSSChasing Mays

Group: General Forum Members
Last Login: Thursday, October 30, 2014 8:25 PM
Points: 649, Visits: 210

I too thought the article was wordy.  It was quite refreshing, getting a plateful of detail and depth on a subject I've burned a lot of thought on, instead of the talking head sound bytes so common in many industry magazines (but not that common on SQL Central!)  After going back and forth on the "what's best" subject several times, I too have finally settled on the one table per lookup solution, for reasons covered by both this article and a similar one I found by Joe Celko--run, don't walk, to learn from Joe Celko--at http://www.dbazine.com/celko22.shtml.

The arguments I had for "MUCKing" lookup tables was I didn't want to have to support dozens and dozens of two-column tables with three rows each (and maybe that was just our application).  I also wanted a single, simple code-lookup table, where I--and anyone else--could quickly and easily find all the codes used by our disturbingly complex system.  But I also wanted DRI, the assurance that all the data was indeed good and valid, and you just can't get that with a MUCK, not without triggers (urg) or application-enforced data integrity (yeah, right).

These days, if I want a one-stop source for code looukp, I'd make a view "concatenating" the various source tables, something like (warning, psuedo code follows):

CREATE VIEW CODE_LOOKUP
  (
    Code_Type
   ,Code_Value
   ,Code_Title
   ,Code_Description
  )
 as
  select 'A', Code_Value, Code_Title, Code_Description
   from Lookup_Table_A
  union select 'B', Code_Value, Code_Title, Code_Description
   from Lookup_Table_B
  union select 'C', Code_Value, Code_Title, Code_Description
   from Lookup_Table_C

This code is not checked, and details and mileage may vary by application and implementation, but you should get the idea.  I don't think this would be suitable for application use, but for quick reference (particularly for developers and system trouble-shooters) it can be invaluable.

As for "zealotry", Mr. Peterson pales in zealotry when compared with anything by Fabian Pascal (http://www.dbdebunk.com), whose writings support this article strongly.  Of course, it isn't "zealotry", but rather "conviction and the will to stand up for it"; it only becomes zealotry when--rightly or wrongly--you strongly disagree with what they say.

   Philip Kelley

 




Post #135868
« Prev Topic | Next Topic »

Add to briefcase ««12345»»»

Permissions Expand / Collapse