A Lookup table strategy defined.
Most databases designs nowadays seem to have at least a few if
not many lookup or reference tables. (I’ll use these two terms
interchangeably) These tables are those small tables that you maintain
your list of States, or CustomerTypes or JobStatus or any number of valid
domain values used to maintain data integrity within your application.
These reference tables usually have simple 2 -4 columns with the naming
convention usually following along the lines of ID, Value, and Description, and
maybe Active. (e.g. CustomerTypeID, CustomerTypeValue, CustomerTypeDesc,
CustomerTypeActive) I have seen database designs that have hundreds of
these reference tables.
There is nothing wrong with the mere existence of these
tables, but they do bring some baggage along with them. One of the
considerations that happens when you have these tables is that someone has to
design and approve them. Someone then has to design, code, and approve
any necessary views, and stored procedures around them. And most of these
tables, views, and stored procedures are fairly simple. There’s usually
very little insert, update, or delete (IUD) activity happening. They’re
mostly used for lookups and for joins in views to represent the entire picture
of a record. In a large application you can also clutter up your list of
tables with so many of these that you begin to think that you need to have a
special naming convention for objects that are lookup related. (e.g.
lkpCustomerType, lkpStates, slkp_GetCustomerType, etc).
All of the previous issues that I presented were from the DBA
or database developer’s perspective, but there’s another perspective to take
into consideration. The application developer’s perspective.
Whether it’s a traditional client server or internet application, the
application developer has to usually create separate functions to access &
modify the data within each table, often creating separate classes to represent
each table. Then the developer needs to create an interface for the user
to maintain the values within these tables. This naturally makes more
work for the developer.
I’ve created a lookup architecture that simplifies things a
bit. This is not necessarily a brand new concept, but it is something
I’ve rarely seen. What I’ve done is to create two tables. I call
them Look and LookType. The structure is shown in Figure 1
Figure 1.

Before I go any further, let me first explain another design
and naming convention that I have. All tables that I design have a field
that is unique and is named after the table with the suffix of GID (Generated
ID). This value is usually an IDENTITY integer although can sometimes be
a uniqueidentifier. This field is not necessarily the Primary Key,
although in this instance it is. My other convention is that all Foreign
Key fields have the suffix FID (Foreign ID). This field doesn’t
necessarily have to have the same name as the Primary Key it references, but
usually ends up that way. So that explains the LookTypeGID, LookGID, and
LookTypeFID fields. Each of the GID fields are IDENTITY integer fields
and are also the Primary Key. The LookTypeFID field is the foreign key to
the LookTypeGID field. The other convention that I have is that all
foreign key values in the tables that point to the Look table have the LID
(Lookup ID) suffix. This makes it easier for me to at a glance realize
where things are related to. The main fields are the Value fields.
These are normally where the reference value is stored. There is also a
description field which can be used for longer and more descriptive
descriptions of the value. On both tables there is also an Active field
which can be used to either inactivate a single value in the list or an entire
list. The
LookOrder field is used solely for display or sorting
purposes. In lookup lists, there isn’t a standard way of sorting
things. Usually somebody wants things sorted a particular way besides
alphabetical, numerical, etc. This field is for that and is of integer
data type. The other important field is the Constant field. This is
a place where you can put an easier to remember value to reference that row
from your application. You don’t want to go hard coding distinct values
into your application code such as SELECT * FROM Customers WHERE
CustomerTypeLID = 2. The reason that this is bad is 1) it has no meaning
without doing a little more work, 2) moving development data to production or
any number of events can reset your IDENTITY values and screw your code up, and
3) The client usually doesn’t have any control over the ID’s, so if he wants to
change what the code is pointing to, he has to edit either the code of go into
the database at the table level.
Now you’re either a tad bit confused or you’re possibly saying
“so what”? Well, first let’s put a couple of values in the tables so that
you can see an example of what I’m talking about.
Sample LookType data.

Sample Look data

Now, from a lookup perspective, all of the values will come
from the Look table. The only table within the database that would
reference the LookType table would be the Look table. Its sole purpose is
to create a grouping identifier for the Look table values. So we can see
that our List of Shippers has a LookTypeGID of 37 and has 3 shippers in
it. We use the constant value of SHIPPER_UPS etc to identify within the
application which row we’re referencing. A sample Order table would then
have an integer field called ShipperLID with a possible value of 112 for
UPS. If I wanted to get the list of shippers I’d call one of my stored
procedures like “EXEC s_LookListByTypeConst ‘SHIPPERS’, NULL” (NULL
representing my Active field. I can either get all of the active records,
or all the records no matter whether active or not. It defaults to Null
which here means only Active)
Now, I know that there are probably a number of you who
immediately see that this design breaks the 1st form of
normalization. I contend that there are always exceptions to the rule
based upon applicability of the situation.
With this design, you never need to create new lookup or
reference tables. You just need to add data to preexisting tables.
That then leads us to the next and most valuable aspect of this design.
We can make generic procedures that allow us to do anything and everything with
these tables with a small list of stored procedures or functions. These
stored procedures or functions can then be used for all application
development. This is where it now gets interesting. I do
mostly web front end applications for my data marts. I’ve created a
single asp page that has access to a VBScript class which access the stored
procedures. This page then allows the application users or managers to
manage their own lookup lists. Gone are the days when the application
manager asked me to add the new CustomerType of Wholesale or add an entire
lookup table and the corresponding stored procedures, or to sort the
ProjectStatus of Open to the top of the list and Closed at the bottom.
Here’s a list of the stored procedures that I use.
You’ll get the gist of their meanings since I’m fairly verbose with my object
names. I’ve also included a couple samples.
s_LookAddEdit
s_LookTypeAddEdit
s_LookDelete
s_LookTypeDelete
s_LookListByGID
s_LookListByConst
s_LookListByTypeGID
s_LookListByTypeConst
s_LookTypeList
s_LookGIDByConst
s_LookTypeGIDByConst
s_LookValueByConst
CREATE PROCEDURE s_LookValueByConst
(
@const varchar(100),
@active int = NULL,
@value varchar(1000) OUT
)
AS
SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
BEGIN TRAN
SELECT @value = LookValue
FROM Look
WHERE LookConst = @const
AND (@active IS NULL OR LookActive = @active)
COMMIT TRAN
GO
CREATE PROCEDURE s_LookListByTypeConst
(
@const varchar(100),
@active int = NULL
)
AS
SET NOCOUNT ON
DECLARE @id int
EXEC s_LookTypeGIDByConst @const, NULL, @id OUT
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
BEGIN TRAN
SELECT *
FROM Look
WHERE LookTypeFID = @id
AND (@active IS NULL OR LookActive = @active)
ORDER BY LookOrder
COMMIT TRAN
GO
I also have one view called v_Look which combines the two
tables and all of the values.
You’ll notice that I set the transaction isolation levels to
read uncommitted. I do this since these are fairly popular tables
used in many different views and sometimes used in the same view more than
once. These tables are also fairly static and so speed is my main concern
here.
Now realize that there are considerations and some
requirements in order to implement this design.
1) All
of your value & description fields need to be the same data type, most
normally varchar. You can obviously store numbers or dates in the value
fields, but they’ll need to be cast correctly on the application for input and
output.
2) You
need to have unique LookConst values.
3) You
should have check constraints on tables that reference the Look table so that
you can validate that only allowable values are put into that field.
(e.g. there would be nothing invalid about putting the value of 30 into the
ShipperLID field in the Order table. Unfortunately, that would then mean
that the Shipper was “December”.
4) All
data access code should come through the stored procedures.
5) This
design currently does not take into consideration of different security rights
for different reference domains. (e.g If Mary can change CustomerType
values, but not Shipper values.) This has to currently be done at the
application level.
6) This
design is limited in that if you have a domain of values that require more than
the designed fields, it doesn’t work very well and you’ll be better off making
an individual table for it.
7) In
some of my applications I have added CreatedDate & CreatedBy &
ModifiedDate & ModifiedBy fields for auditing purposes.
8) I
found this design to work very well for those application level setting tables
or application / DTS global variable tables.
I have slowly over the years refined this module of
functionality to include such things as reordering the display order if a user
changes, inserts, or deletes a Look record, creating stored procedures to
return not only the value but also the description (sometimes a user may want
to see a drop down list of State abbreviations, other times the entire State
name), and am now working on the addition of functions to replicate a lot of
the stored procedures so that a developer could use the function in a SELECT
query simply.
I hope this gets you to thinking on how to reduce the number
of reference tables in your databases and if you have any feedback, please let
me know. I’m always interested in hearing other developers’ thoughts.
David Sumlin