SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Lookup Table Design


Lookup Table Design

Author
Message
bnordberg
bnordberg
SSC-Addicted
SSC-Addicted (469 reputation)SSC-Addicted (469 reputation)SSC-Addicted (469 reputation)SSC-Addicted (469 reputation)SSC-Addicted (469 reputation)SSC-Addicted (469 reputation)SSC-Addicted (469 reputation)SSC-Addicted (469 reputation)

Group: General Forum Members
Points: 469 Visits: 572

I've have worked with lots of systems that utilize the 1 table approach to lookups. And my opinion is that they should never, ever be used. Imagine making a datawarehouse that relies on loopups for values. Analysis services will not allow complex joins with where statements, so you will need to make views that combine all the data. Now take the main table (say a list of employees) and say you need to return 18 different lookup values. With 1 lookup table structure your where statement is huge and unless well indexed, the joins will perform terribly.

Anyone that has ever done analysis on data from a system with 1 lookup table will tell you that it is a bad design and difficult to work with.

Now think of table size for your 1 lookup table to work you need an extra field to store what field the value is used for. Suppose you have 500 lookup values, you could create another table for values and maybe only store an int for the field value, but then you need to do another join to figure your values out. Anyway having 1 lookup table will take up more physical space regardless of the design.





Peter Kryszak
Peter Kryszak
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1348 Visits: 3

In the case of data warehousing, I agree. Each dimension should have its own table for performance and maintenance reasons. More importantly, dimensions are often more than just code lookups and may have many other attributes for each value, specific to the dimension's type or meaning. Generally speaking, the number of dimensions in a fact table is low so the number of tables is not great, at least for a group of similar fact tables.

In the case of an operational database, I am quite happy with the one table approach. Our table is properly indexed and performs well. These are codes that simply replace repeating text and don't have distinct attributes for each type. We would require more than 100 code lookup tables to track all the different code types in this database if we separated them out. That would be a development and maintenance nightmare.

In any case, you can and should avoid the ugly WHERE clauses and poor performance with one or more user defined functions for code lookup and proper indexes on the code table.





Leo Peysakhovich
Leo Peysakhovich
Old Hand
Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)

Group: General Forum Members
Points: 340 Visits: 350

As you can see in article, real life scenario 2 is more complicated than the one table. It consists of 4 tables (in my case). I don't like codes usage and linkage to be done in the same table (recursive tables with unlimited number of levels). Based on 4 tables you DO NOT HAVE MORE THAN 1 LEVEL of dependency. And codes are separated by groups. Plus, I store ONLY GENERIC codes in the structure. The codes that are not company specific. Company specific codes are in their own tables based on scenario 1. This allows reuse generic codes and separates them from user specific ones. If you have one database or one big data warehouse it may be irrelevant. But if server has 200+ databases it will be much less overhead to keep generic code tables into on separated database. In my case, 2-3 new databases created in server weekly. All of them have the same generic codes and it will be complete disaster try to synchronize them when changes necessary.





Leo Peysakhovich
Leo Peysakhovich
Old Hand
Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)

Group: General Forum Members
Points: 340 Visits: 350

As you can see in article, real life scenario 2 is more complicated than the one table. It consists of 4 tables (in my case). I don't like codes usage and linkage to be done in the same table (recursive tables with unlimited number of levels). Based on 4 tables you DO NOT HAVE MORE THAN 1 LEVEL of dependency. And codes are separated by groups. Plus, I store ONLY GENERIC codes in the structure. The codes that are not company specific. Company specific codes are in their own tables based on scenario 1. This allows reuse generic codes and separates them from user specific ones. If you have one database or one big data warehouse it may be irrelevant. But if server has 200+ databases it will be much less overhead to keep generic code tables into on separated database. In my case, 2-3 new databases created in server weekly. All of them have the same generic codes and it will be complete disaster try to synchronize them when changes necessary.





Leo Peysakhovich
Leo Peysakhovich
Old Hand
Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)

Group: General Forum Members
Points: 340 Visits: 350

As you can see from real live example (scenario 2) code table have id as integer. Group and type tables are very small (50-100 records each). And in a lot of cases if group code is COUN then description COUNTRY is not really necessary. It makes description unnecessary for many cases. Plus with such small number of rows performance is not suffering at all.





Leo Peysakhovich
Leo Peysakhovich
Old Hand
Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)Old Hand (340 reputation)

Group: General Forum Members
Points: 340 Visits: 350

As you can see in article, real life scenario 2 is more complicated than the one table. It consists of 4 tables (in my case). Based on 4 tables you DO NOT HAVE MORE THAN 1 LEVEL of dependency. And codes are separated by groups. Plus, I store ONLY GENERIC codes in the structure. The codes that are not company specific. Company specific codes are in their own tables based on scenario 1. This allows reuse generic codes and separates them from user specific ones. If you have one database or one big data warehouse it may be irrelevant. But if server has 200+ databases it will be much less overhead to keep generic code tables into on separated database. In my case, 2-3 new databases created in server weekly. All of them have the same generic codes and it will be complete "madness" try to synchronize them when changes necessary.

Each design should be choosen carefully for each specific case! As always each one have pluses and minuses. It is mean choose ideas carefully for you database scenario.





N Cook
N Cook
SSC-Enthusiastic
SSC-Enthusiastic (145 reputation)SSC-Enthusiastic (145 reputation)SSC-Enthusiastic (145 reputation)SSC-Enthusiastic (145 reputation)SSC-Enthusiastic (145 reputation)SSC-Enthusiastic (145 reputation)SSC-Enthusiastic (145 reputation)SSC-Enthusiastic (145 reputation)

Group: General Forum Members
Points: 145 Visits: 22

Peter,

Can you explain a little more what you mean when you say that a simple function call is much better than a join to do a code lookup? Are you speaking of a user define function in sql server? The reason I ask is that I'm still learning of course (always) and so I'm just curious.

EDIT: Ok I just now saw you're second post and I see that you are talking about a user defined function. Now how exactly would you use a UDF with a lookup table? Do you have an example?

Thanks,
Nate


publicbutler@imapmail.org
publicbutler@imapmail.org
Valued Member
Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)

Group: General Forum Members
Points: 63 Visits: 101
This is one of the worst practices from design and performance standpoints.

First, design-wise, why give up foreign keys? As mentioned in an ealier post, that is the WHOLE POINT of RDBMS's. That is Celko's issue with consolidating codes into a single table. Second, performance-wise, does any RDBMS engine process trigger code as efficiently as its built in routines for checking foreign key validity? The answer is no.

Top priority has to be the integrity of the data. It is not ease of adminstration.

Make the right choice, design a better database. Then, look for a way to maintain your codes better. Maybe, a central code database that publishes updates to many databases with common structures is a better approach as it does not compromise the database design.

- Jay
roger clarke
roger clarke
SSC Rookie
SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)SSC Rookie (41 reputation)

Group: General Forum Members
Points: 41 Visits: 90
There is a point of view (well actually mine) which says that you should not have any lookup tables at all.
The reason being that every description in a lookup table should be available in the 150+ languages of the planet ... let the GUI do the translating!
publicbutler@imapmail.org
publicbutler@imapmail.org
Valued Member
Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)Valued Member (63 reputation)

Group: General Forum Members
Points: 63 Visits: 101
Why wouldn't you place the language translations in tables?

Do you really trust that every application running against your database is going to decoding correctly? For the same reasons, you would not trust the applications to enforce referential integrity, I would not trust them to decode data expressions.

- Jay
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search