RE: Second Normal Form – SQLServerCentral

SSCrazy Eights

Points: 9526

July 17, 2011 at 12:59 pm

Tom.Thomson (7/14/2011)
That's sort of true. But Occam's razor is a useful tool when trying to present things in a simple manner, and what a functional dependency is all about is that some values are fixed by some other values; at least I hope that's what you mean by a functional dependency (it's what every academic paper I've read that contained the term used it for).

I think it is important to be clear on definitions given what you said in your previous article about 1NF permitting nulls. If a reader mistakenly thinks that 2NF and other normal forms apply equally well to a table with nulls as to a table without them then they may want to know what a functional dependency means in a table with nulls. The concept of functional dependency was developed without considering the impact of nulls and I think that pretty much every other text I've read on the topic excludes the possibility of nulls being part of any FD. You have opened the door to nulls in your first article but now you seem to be ignoring the consequences.

It's important to remember that Codd and Heath were discussing NULL (not two different NULLs, just one) way back when they were preparing their early papers on normalisation, so that it is extremely unlikely that NULLs weren't considered at all.

I will take Codd's word for it when he says "The concepts and rules of functional dependence, multi-valued dependence, and join dependence were all developed without considering missing db-values. All the normal forms based on these dependencies were also developed without considering missing dbvalues."[1]. Whether or not nulls were actually in his mind at the time, he chose to leave them out of what he published. Codd goes on to say:

"the normalization concepts do NOT apply, and should NOT be applied, globally to those combinations of attributes and tuples containing [null] marks. Instead, they should be applied [...] to a conceptual version of the database in which tuples containing missing-but-applicable information in the pertinent attributes have been removed"

[1] Missing Informatlon (Applicable and Inapplicable) in Relational Databases, SIGMOD RECORD, Vol. 15, No. 4, December 1986

So despite what you say, in Codd's view 2NF does exclude nulls. You haven't said in your article that 2NF applies only to a version of a database without nulls but you haven't suggested any alternative way of evaluating 2NF for a database with nulls either.

I don't mean to imply that I agree wtih Codd's view. I think it's a terrible suggestion. It completely ignores the very real possibility that the kind of redundancy and resulting anomalies which the NFs were designed to eliminate can also occur when some rows include nulls. What bothers me is that you didn't mention either this or any other strategy for dealing with the nulls that you say you want to permit. Your readers are left in the dark about it.

Just a fortnight ago one of you anti-NULL purists was claiming that there was no such term as "primary key" in relational theory, only in SQL! (Complete nonsense, as you will agree.)
I think most people will understand that I used "the primary key" in the sense used in the paper which introduced the relational model to the world - it is a primary key that has been chosen to be "the" primary key as opposed to just "a" primary key and it is a perfectly sensible description of a candidate key to say that it is something that can be so chosen, both because that's why the word "candidate" is there - a candidate is something that can be chosen (whether by fate, by a schema designer, or otherwise) and since the term "candidate key" was in fact introduced as a replacement for "primary key" rather later. The concept of superkey is of course completely superfluous.

Sure, but the science and the language have moved on since Codd used the term "primary key". Almost no-one today uses that phrase to refer to any or all keys of a relation and using it to mean just one key is completely redundant in a discussion about normalization.

If the person you mentioned was confused about primary keys then I'm not surprised and he's not the only one. In the previous 1NF article discussion you said "SQL has a restriction that there can be only one primary key". Taken literally that statement is obviously untrue. By expecting us to guess each time whether you mean "the primary key" or "a primary key" you are committing the same linguistic folly that Codd did. Please let's leave "primary" keys out of a discussion about normal forms.