RE: Second Normal Form – SQLServerCentral

SSC Guru

Points: 104773

July 14, 2011 at 5:05 am

David Portas (7/14/2011)
The author's point about A being "fixed by" the values of X presumably means a functional dependency of the form X->A. Functional dependencies are the basis of 2NF.

That's sort of true. But Occam's razor is a useful tool when trying to present things in a simple manner, and what a functional dependency is all about is that some values are fixed by some other values; at least I hope that's what you mean by a functional dependency (it's what every academic paper I've read that contained the term used it for).

If I were intending to write about formal theory rather that something useful for practical database people, I would of course treat FDs as a special case of MVFDs (an FD is an MVFD where the range value set is a singleton) and then for 2NF (and 3NF too) discuss only a special case of FDs (where the range domain is a 1-tuple). I don't think that would be much use to most of the SQLServerCentral audience, who are practitioners not theorists.

It 's important to remember that the concept of functional dependency was developed without regard for nulls and it is an assumption of 2NF (and 3NF, BCNF, etc) that relations consist only of values and not nulls.

It's important to remember that Codd and Heath were discussing NULL (not two different NULLs, just one) way back when they were preparing their early papers on normalisation, so that it is extremely unlikely that NULLs weren't considered at all. They didn't work out the effect of nulls on normalisation at that point - which is why Codd later wrote that their introduction had meant that the normalisation had had to be revisited - but certainly there are no grounds now, three decades after NULLs were first proposed, for claiming that relational theory has no room for them or that 2NF, 3NF, BCNF and etc (presumably "etc" means EKNF, 4NF, various 5NFs, DKNF and Date's 6NF) exclude them (for example, if 5NF had been intended to exclude NULL, the paper introducing it would have said so - its author, Ron Fagin, was working with Codd and had just reviewed Codd's RM-T and "...Capture more Meaning" papers for him, so he certainly knew all about NULLs. For 6NF, I imagine Date did intend that NULL be forbidden - but that's the only one I believe does exclude NULL.

The definition of a candidate key as a set of columns that "could reasonably be used as the primary key" is a bit strange. A candidate key is quite precisely a minimal superkey. There is no difference at all between a candidate key and a primary key (or even "the" primary key) so I'm not sure what is meant by a "reasonable choice" in that context. It's a pity the explanation of candidate keys isn't spelt out more clearly because it is very often a point of confusion for some people.

Just a fortnight ago one of you anti-NULL purists was claiming that there was no such term as "primary key" in relational theory, only in SQL! (Complete nonsense, as you will agree.)

I think most people will understand that I used "the primary key" in the sense used in the paper which introduced the relational model to the world - it is a primary key that has been chosen to be "the" primary key as opposed to just "a" primary key and it is a perfectly sensible description of a candidate key to say that it is something that can be so chosen, both because that's why the word "candidate" is there - a candidate is something that can be chosen (whether by fate, by a schema designer, or otherwise) and since the term "candidate key" was in fact introduced as a replacement for "primary key" rather later. The concept of superkey is of course completely superfluous.

Tom