Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12

FK vs Check Constraint for Lookup/Reference Tables Expand / Collapse
Author
Message
Posted Wednesday, August 28, 2013 4:40 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Today @ 8:05 AM
Points: 14,802, Visits: 27,276
Simplest suggestion I can make here is to let the relational system work like a relational system. Use the tools provided because they are a part of how SQL Server has been written and optimized. 5, 8, or even 20 table JOINs are no big deal if the queries are written correctly, constraints are in place and enforced, you have indexes, and all the statistics are up to date. I've seen 86 table JOINs work just fine (well, apart from the 3 minute compile time). Don't sweat that stuff. Focus on core principals.

----------------------------------------------------
"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." Theodore Roosevelt
The Scary DBA
Author of: SQL Server 2012 Query Performance Tuning
SQL Server 2008 Query Performance Tuning Distilled
and
SQL Server Execution Plans

Product Evangelist for Red Gate Software
Post #1489113
Posted Wednesday, August 28, 2013 11:03 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Friday, February 07, 2014 6:33 AM
Points: 496, Visits: 815
Thank you for the insight. As always this site is very informative thanks to everyone that contributes.



Post #1489335
Posted Friday, October 04, 2013 4:37 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Yesterday @ 2:06 PM
Points: 37, Visits: 176
I'm consulting at a company with a developer-designed database which has a very large table (over 2 billion rows) that has a number of small char or varchar categorical columns. Currently, there are no check constraints or foreign keys pointing to look-up tables. I'm wondering what's the best approach to enforce integrity at the DB level rather than at the app level.

Consider the case where the categorical column is a char(1) and appears only in the subject table. It seems ridiculous to create a lookup table if the only valid values are 'M' or 'F', and given that the meaning of these abbreviations are so well known, a check constraint certainly seems better than a two row lookup table and an FK. Going one step further, there can be a case where there are 15 valid char(1) codes. Here the case for a lookup table is a bit better, since the lookup table could include a description column for clarity. We could also change the PK/FK data type to tinyint and not increase the size of the subject table. I wonder if there is any performance benefit in replacing the char(1) with a tinyint when joining.

Then there are marginal cases where the categorical char column is larger but not too large. In these cases, there would be a minor size benefit from replacing the char columns with tinyint FKs pointing to a lookup table. But if the number of alternatives was still small (e.g. Male and Female) so clarity was not an issue, I wonder whether a check constraint would perform better than the FK and lookup table.

For the case where the categorical character variable is quite large relative to the size of the smallest integer-type key value, it seems clear that the FK and lookup table is best from a size perspective alone. And lookup tables are preferred when there are many valid values and their presence adds clarity or when they could be used by multiple tables. Check constraints have the benefit of reducing the number of tiny lookup tables. But it seems they're only good for a 'small' number of 'small' sized categorical alternatives that appear only in one table.

Has anyone done any timing comparisons on this issue?



Post #1501784
Posted Thursday, January 09, 2014 5:12 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Wednesday, March 12, 2014 8:37 AM
Points: 1, Visits: 29
This is a great discussion. I share similar views as JRoughgarden. But I wonder even on a small number of values that
never change how do you achieve clarity when using CHECK constraint. Let's use JRoughgarden sex column with check constraint - two values 'M' and 'F' are allowed and its pretty clear what sex is a person when someone looks at the row from this table. But then comes Joe Celko with his sex column: the ISO sex codes are {0= unknown, 1= male, 2= female, 9= lawful persons}. I think he always uses CHECK constraint and not a lookup table for this case. On the front end you can use enumeration with description (i.e. front end written in java) to get what sex is someone. But if I go to the database and see values 0,1,2,9 in a row I think I will not know what they mean - how do you achieve clarity for cases like this? Do you say go see ISO sex codes on google or open front end application and search the enumeration to find code descriptions??

regards
Post #1529296
Posted Thursday, January 09, 2014 6:26 AM


Ten Centuries

Ten CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen CenturiesTen Centuries

Group: General Forum Members
Last Login: Yesterday @ 3:51 PM
Points: 1,261, Visits: 2,292

Traditional Model:
tEmployees (EmployeeID int,StatusID int)
tEmployeesStatus (StatusID int,StatusDescription (varchar(20))

Proposed Model:
tEmployees (EmployeeID int,StatusDescription varchar(20))
tEmployeesStatus (StatusDescription varchar(20))

Create a UDF that validates the data being entered into the StatusDescription field:

Create FUNCTION [dbo].[fnEmployeeStatusList](@EmployeeStatus varchar(20))
RETURNS char(1)
AS
BEGIN

Declare @Exists char(1)

SELECT @Exists = [StatusDescription ] from tEmployeeStatus Where [StatusDescription ] = @EmployeeStatus
IF @Exists > '' BEGIN SET @Exists = 'Y' END
IF ISNULL(@Exists,'') = '' BEGIN SET @Exists = 'N' END
-- Return the result of the function
RETURN @Exists

END

ALTER TABLE [dbo].[tEmployees] WITH CHECK ADD CONSTRAINT [CK_tEmployees_Status]CHECK (([dbo].[fnEmployeeStatusList]([EmployeeStatus])='Y'))
GO

ALTER TABLE [dbo].[tEmployees CHECK CONSTRAINT [CK_tEmployees_Status]
GO


I am not sure if I understand this proposed model correctly. In the proposed model, i don't see any importance of even creating the table tEmployeesStatus as StatusDescription is the only column in this table & this StatusDescription has already been written to the table tEmployees.


- Divine Flame
Post #1529312
Posted Thursday, January 09, 2014 6:38 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Friday, February 07, 2014 6:33 AM
Points: 496, Visits: 815
It's been awhile since I posted this but I believe the second table tEmployeeStatus was used to maintain the status list so statuses could be added from a UI. You don't to have to have that table as the list could be maintained directly in the constraint but of course you lose some flexibility in regards to the above requirements.
Post #1529319
Posted Friday, January 10, 2014 12:59 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 8:37 AM
Points: 8,287, Visits: 8,738
Divine Flame (1/9/2014)
I am not sure if I understand this proposed model correctly. In the proposed model, i don't see any importance of even creating the table tEmployeesStatus as StatusDescription is the only column in this table & this StatusDescription has already been written to the table tEmployees.

No, inserting a row into tEmployees with a staus that isn't already in tEmployeesStatus fails because the check function will not return 'Y' so that row would violate the check constraint. The tEmployeesStatus table has to be maintained separately. This is reasonable if it doesn't cause a big space problem - so if there are not more than about a few hundred distinct statuses, and not too many employees (ie it only makes sense when there isn't much data).

However, the whole thing is abominably badly written; probably needs NOT NULL on all the column definitions, in tEmployeesStatus the single column should be declared as the primary key, in tEmployees something should be the primary key, and that UDF should be much shorter, everything between BEGIN and END (not inclusive) could be replaced by a single return(select...) statement using CASE and IN.


Tom
Post #1529949
Posted Friday, January 10, 2014 1:09 PM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Friday, February 07, 2014 6:33 AM
Points: 496, Visits: 815
It was only written to help convey the idea behind the question. It's not production code.
Post #1529955
« Prev Topic | Next Topic »

Add to briefcase ««12

Permissions Expand / Collapse