Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««123

Second Normal Form Expand / Collapse
Author
Message
Posted Saturday, July 16, 2011 12:36 AM


Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Thursday, April 4, 2013 5:00 AM
Points: 11, Visits: 28
Look at Esteban Zimányi's 2006 paper "Temporal Aggregates and Temporal Universal Quantification in Standard SQL" (available on pages 16 to 25 of this pdf doc, or get hold of the book "Temporal Data and the Relational Model" by Date, Darwen and Lorentzos if you are heavily into data warehousing with history requirements. For a brief discussion the normal form level of some table types that may be useful there is this but be warned it's aimed at showing that their chosen table forms are all in 6NF, rather than showing how to do anything useful with them; but if you like it, or the tables it describes, you may want to read up on anchor modelling more generally.


Thanks a lot Mr. Thomson...
I have learned some new things about database design like Temporal Data and 6th Normal Form. I have downloaded your suggested papers and also I managed to get the Date's Book "Temporal Data and the Relational Model". I have started reading all about temporal data and I am very excited to know this new terminology (at least for me it is new). Thanks a lot to add some new things into my knowledge horizon :)


Fazal Khan
C.T.O
TARECO SOFTONICS
Post #1142904
Posted Sunday, July 17, 2011 12:59 PM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Today @ 4:44 AM
Points: 449, Visits: 3,391
Tom.Thomson (7/14/2011)
That's sort of true. But Occam's razor is a useful tool when trying to present things in a simple manner, and what a functional dependency is all about is that some values are fixed by some other values; at least I hope that's what you mean by a functional dependency (it's what every academic paper I've read that contained the term used it for).

I think it is important to be clear on definitions given what you said in your previous article about 1NF permitting nulls. If a reader mistakenly thinks that 2NF and other normal forms apply equally well to a table with nulls as to a table without them then they may want to know what a functional dependency means in a table with nulls. The concept of functional dependency was developed without considering the impact of nulls and I think that pretty much every other text I've read on the topic excludes the possibility of nulls being part of any FD. You have opened the door to nulls in your first article but now you seem to be ignoring the consequences.

It's important to remember that Codd and Heath were discussing NULL (not two different NULLs, just one) way back when they were preparing their early papers on normalisation, so that it is extremely unlikely that NULLs weren't considered at all.

I will take Codd's word for it when he says "The concepts and rules of functional dependence, multi-valued dependence, and join dependence were all developed without considering missing db-values. All the normal forms based on these dependencies were also developed without considering missing dbvalues."[1]. Whether or not nulls were actually in his mind at the time, he chose to leave them out of what he published. Codd goes on to say:


"the normalization concepts do NOT apply, and should NOT be applied, globally to those combinations of attributes and tuples containing [null] marks. Instead, they should be applied [...] to a conceptual version of the database in which tuples containing missing-but-applicable information in the pertinent attributes have been removed"


[1] Missing Informatlon (Applicable and Inapplicable) in Relational Databases, SIGMOD RECORD, Vol. 15, No. 4, December 1986

So despite what you say, in Codd's view 2NF does exclude nulls. You haven't said in your article that 2NF applies only to a version of a database without nulls but you haven't suggested any alternative way of evaluating 2NF for a database with nulls either.

I don't mean to imply that I agree wtih Codd's view. I think it's a terrible suggestion. It completely ignores the very real possibility that the kind of redundancy and resulting anomalies which the NFs were designed to eliminate can also occur when some rows include nulls. What bothers me is that you didn't mention either this or any other strategy for dealing with the nulls that you say you want to permit. Your readers are left in the dark about it.

Just a fortnight ago one of you anti-NULL purists was claiming that there was no such term as "primary key" in relational theory, only in SQL! (Complete nonsense, as you will agree.)

I think most people will understand that I used "the primary key" in the sense used in the paper which introduced the relational model to the world - it is a primary key that has been chosen to be "the" primary key as opposed to just "a" primary key and it is a perfectly sensible description of a candidate key to say that it is something that can be so chosen, both because that's why the word "candidate" is there - a candidate is something that can be chosen (whether by fate, by a schema designer, or otherwise) and since the term "candidate key" was in fact introduced as a replacement for "primary key" rather later. The concept of superkey is of course completely superfluous.

Sure, but the science and the language have moved on since Codd used the term "primary key". Almost no-one today uses that phrase to refer to any or all keys of a relation and using it to mean just one key is completely redundant in a discussion about normalization.

If the person you mentioned was confused about primary keys then I'm not surprised and he's not the only one. In the previous 1NF article discussion you said "SQL has a restriction that there can be only one primary key". Taken literally that statement is obviously untrue. By expecting us to guess each time whether you mean "the primary key" or "a primary key" you are committing the same linguistic folly that Codd did. Please let's leave "primary" keys out of a discussion about normal forms.


David
Post #1143124
Posted Sunday, July 17, 2011 2:22 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Sunday, October 19, 2014 11:12 AM
Points: 7,791, Visits: 9,545
David Portas (7/17/2011)
I think it is important to be clear on definitions given what you said in your previous article about 1NF permitting nulls. If a reader mistakenly thinks that 2NF and other normal forms apply equally well to a table with nulls as to a table without them then they may want to know what a functional dependency means in a table with nulls.

Actually I think it is far more important to explain the fundamental underlying concept - that the business rules so constrain values that knowing values for some attributes guarantees that you also know the values for some others - than to mess about with the mathematical abstraction "functional dependency", which has no meaning at all to almost everyone involved in database development. That underlying concept is what matters in the real world of database design. You may be one of those people who think better in terms of mathematical abstractions than in terms of the real world phenomena they model - if so, you are not part of the audience I was writing for. I don't want to tell my audience that they have to learn a lot of pure mathematics to understand normalisation (which is NOT part of relational theory, but part of real world database design). Nor do I want to arrive at a situation where my audience ends up, by the end of the article series, foolishly believing that maximum decomposition is more imprtant than accurate representation (so you'll really hate at least one of the later articles in the series). Nor do I want my audience to believe that normalisation can't be applied to their real-world databases where missing values are inevitable so they have NULLs, because that would deprive them of a valuable tool-kit for improving their schema designs. Perhaps you will understand my position better if I use an analogy: "look, we want to keep the baby - so be careful when you discard the bathwater"?

While "permit NULL only when it is absolutely necessary (and be extremely sure that it really is absolutely necessary)" is a good rule that every schema designer should follow, "forbid NULL everywhere" is nonsense perfectionism (at least until we have better ideas for NULL elimination - a research topic which anti-NULL purists have ignored ever since Date's bizarre "default values will always work in place of NULL" idea) of the same order as "make all your schemas conform to DKNF" (which is, as Date pointed out here, starting about 20 lines down, provably impossible in many practical cases; in fact we don't even approach understanding where it is possible).

I will take Codd's word for it when he says "The concepts and rules of functional dependence, multi-valued dependence, and join dependence were all developed without considering missing db-values. All the normal forms based on these dependencies were also developed without considering missing dbvalues."[1]. Whether or not nulls were actually in his mind at the time, he chose to leave them out of what he published. Codd's goes on to say:

"the normalization concepts do NOT apply, and should NOT be applied, globally to those combinations of attributes and tuples containing [null] marks. Instead, they should be applied [...] to a conceptual version of the database in which tuples containing missing-but-applicable information in the pertinent attributes have been removed"



I don't mean to imply that I agree wtih Codd's view. I think it's a terrible suggestion. It completely ignores the very real possibility that the kind of redundancy and resulting anomalies which the NFs were designed to eliminate can also occur when some rows include nulls. What bothers me is that you didn't mention either this or any other strategy for dealing with the nulls that you say you want to permit. Your readers are left in the dark about it.

Then what do you mean to imply? That I should provide about 16000 words of abstruse mathematics to demonstrate the several ways (one of which is the way Codd mentioned and you apparently failed to notice in the quotation you used - you apparently also failed to notice that the informal approach in the article exactly followed that way except for an insistence on banning NULL in all prime attributes, which I think Codd probably intended although he didn't actually mention it that sentence) in which the formal theory can be adapted, instead of sticking to the actual design issues that the formal theory is intended to deal with and model? When I was very young I might have taken that approach - I was a pure mathematician - but since then I have become an applied mathematican/computer scientist/engineer and gained experience by applying and teaching people to apply mathematics in industry; this has taught me that going back to the real world things that are being modelled and talking about them is where I need to be if I want to teach effectively. It has also taught me that I understand the mathematics much better and have far more insight into the abstract theory if I approach it that practical way.

Anyway, I don't think it's useful to try to hijack this discussion (as you have successfully hijacked others) into a debate on whether NULL is a bad thing. We know it's a bad thing. That it's also a necessary thing is something that is unfortunate, and it's even more unfortunate that some people can't accept that we must live with that necessity.


Tom
Post #1143135
« Prev Topic | Next Topic »

Add to briefcase «««123

Permissions Expand / Collapse