﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / Article Discussions / Article Discussions by Author / Discuss content posted by Tom Thomson  / First Normal Form / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Sun, 26 May 2013 01:18:46 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>I finally read this article (and amazingly, all 70 posts...something I NEVER do).  Very good article and like several others have pointed out your responses in these discussions are exemplary.  Tom you are truly "a gentleman and a scholar".  I'm going to read the 2NF and 3NF articles now.  Thank you for writing.</description><pubDate>Thu, 28 Jul 2011 09:24:38 GMT</pubDate><dc:creator>thisisfutile</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Tom.Thomson (7/15/2011)[/b][hr]That only works if logic is restricted to black and white (true and false, no other possibilities).  Constructive mathematicians work in a world where there are infinitely many shades of grey.  Even non-constructivists are happy to use all sorts of MVLs instead of classical two valued logic (the difference between them and constructivists is that the constructivists refuse to use the 2VL black-and-white logic, because they (or we - I'm not admitting to be on either side of that debate) believe it just doesn't work).  Relational Database theoreticians have a similar division - anti-MVL and pro-MVL; it's kind of sad that in the Relational Database world they don't respect each other's positions, unlike the situation in mathematics.[/quote]Fair enough ;-) I can appreciate the "shades of grey" stance.  For practical purposes, a tolerance for "shades of grey" is generally the more flexible approach, and more in line with "common sense" and real life practices, etc... as it seems we can rarely ever make a clear distinction about real world affairs that we might then attempt to model.  I guess the question there though would be: Is it because there is no distinction, or is it because we just can't see it?For the sake of debate (which I love) let me play devil's advocate and posit these notions and see how far down the rabbit hole we can go:- Anything outside and/or BOTTOM/TOP of the domain of "true or false" is either inapplicable to true or false or unknown/unknowable as true or false and therefore cannot be evaluated as such.  In this example, the other values outside true and false, or the grey, is akin to the mind filling in the blanks of an incomplete picture, and not actually an implication that a BOTTOM or TOP actually exists or that there is actually anything outside that domain.  The grey here is only an approximation on the unknown, unknowable, or the seemingly inapplicable.  Once known, knowable, or applicable though... it becomes either true or false.  That is a limitation on "the knower", not true/false.- As for any known "value" that is seemingly "grey", seemingly not quite true or false - as opposed to chalking this up to a multi-valued system of logic, one could instead attribute this to a layer of complexity (contrived or otherwise) on top of a fundamental true/false framework, but still essentially a subset of that framework.  That's not to say there's anything wrong with going for complexity.  The world can be a pretty complex place, so having a complex system of logic can help with understanding interactions at higher orders of magnitude, without the need to evaluate each element individually (i.e. - When counting on your fingers, the finger is only ever up or down, either there is/true/up or there is not/false/down, but since we only have a limited number of fingers to work with... decimal system).- Any known value perceived to be somewhere either in between true and false or other than, and not one or the other, is actually not one value (simple/atomic), but only represented as one value as a result of higher levels of complexity on top of the fundamental.  At it's core though, it is a collection of true and false values, which if decomposed would yield only the two values, regardless of how infinitely or infinitesimally true or false they might be.I'm sure I have violated one or more mathematical principles somewhere along the way, so feel free to school me ;-) I'm more of a philosopher than a mathematician any way :cool:This is what happens when I can't sleep... :hehe:</description><pubDate>Sat, 16 Jul 2011 02:00:33 GMT</pubDate><dc:creator>pjcrout</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]pjcrout (7/15/2011)[/b][hr]Tom,Firstly, great article.  I actually stumbled onto it after reading your 2NF article via the link back here, and wanted to start from the beginning.  Glad I did ;-)After reading, I was like "Wow... Good times..."[/quote]Thanks for that - every comment like that encourages me to keep on writing.[quote]Then, after reading the comments, I was like "Aww... Bad Times..."I will echo others in saying: Good job defending you stance.  The scrutiny has been, at times, excessive (I thought), but I think you really made some great points both in your article and your replies.  Honestly, coming from an App Dev background into this realm of DBAs, the article definitely helped put the concept of normal forms under a more focused lens for me (all theoretical nuance squabbling aside).[/quote]I don't think anything was actually excessive - it's all fair comment and debate.  I'm glad though to hear that you thought I did a good job in defending my position.  I came from an mathematician and app developer background into database research, and after a lot of other things (largely not DB related) into doing the DBA thing as well as much else (a couple of "you name it, I do it" jobs), which I think is very like coming from a developer background into the DBA realm so we probably have a lot of ideas in common. Anyway,  I am trying to write for practical DBAs who got into that role accidentally (as I did) and not for the theorists. I do find it an advantage to know some of the theoretical nuances too, though - it save a lot of trouble when designing a schema.[quote]I also liked how you addressed the concept of [b]context[/b] as it relates to "atomicity".  So many data elements that we take for granted as being "simple" in nature rarely ever are (one nit that went unpicked: Full Name... atomic? yes/no/may-be-so), and are in fact themselves hierarchical arrays of data carried over from "real life" attempts at bringing a measure of order to The Chaos.  I definitely think it was an important point to raise.  I think using terms like "atomic" and "natural" in all this lingo tends to get us thinking to abstract and we forget that we're not actually talking about the indestructible, elemental entities of the cosmos.[/quote]Ted Codd, who introduced the atomicity idea into relational database theory (which he invented, of course) didn't use the term "atomic" in his early writingss. Instead he said "simple".  I like that term (indeed I prefer it to atomic, because atomic doesn't properly reflect his insistence - which I absolutely believe was correct - on sticking to a first order relational calculus) but I don't use it because sticking to "atomic" usually gets me into fewer silly arguments (I'm very lazy, so avoiding pointless arguments is one of my prime objectives).[quote]These data elements are, more often than not, just a bunch of non-sense concocted by a band of numb-skulls (and I do not exclude myself from being a subset of that domain) :-P[/quote]Join the club - I'm definitely a member of that domain:-D[quote]Premise: Humans are imprecise, inaccurate, and to some degree unpredictable.  For that reason, they are incapable of existing within a "perfect" system, because it would be too rigid, leaving no tolerance for error (which humans generate with great proficiency).Dilemma:- For the system to function properly it should be perfect     &amp;gt; (i.e. - in perfect adherence with "the rules".  Note: Rules created by who? Exactly.)- If the system is perfect it doesn't workSummary:As [i]The Architect[/i] put it, "The problem ... is choice."  Or to put it a different way (and I think more accurately): The problem is human error.  Because in the world of 1 and 0, true and false, right and wrong... Choice is an illusion, and in the end you are either right or wrong.  In adherence or in error.[/quote]That only works if logic is restricted to black and white (true and false, no other possibilities).  Constructive mathematicians work in a world where there are infinitely many shades of grey.  Even non-constructivists are happy to use all sorts of MVLs instead of classical two valued logic (the difference between them and constructivists is that the constructivists refuse to use the 2VL black-and-white logic, because they (or we - I'm not admitting to be on either side of that debate) believe it just doesn't work).  Relational Database theoreticians have a similar division - anti-MVL and pro-MVL; it's kind of sad that in the Relational Database world they don't respect each other's positions, unlike the situation in mathematics.</description><pubDate>Fri, 15 Jul 2011 19:06:57 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>Tom,Firstly, great article.  I actually stumbled onto it after reading your 2NF article via the link back here, and wanted to start from the beginning.  Glad I did ;-)After reading, I was like "Wow... Good times..."Then, after reading the comments, I was like "Aww... Bad Times..."I will echo others in saying: Good job defending you stance.  The scrutiny has been, at times, excessive (I thought), but I think you really made some great points both in your article and your replies.  Honestly, coming from an App Dev background into this realm of DBAs, the article definitely helped put the concept of normal forms under a more focused lens for me (all theoretical nuance squabbling aside).I also liked how you addressed the concept of [b]context[/b] as it relates to "atomicity".  So many data elements that we take for granted as being "simple" in nature rarely ever are (one nit that went unpicked: Full Name... atomic? yes/no/may-be-so), and are in fact themselves hierarchical arrays of data carried over from "real life" attempts at bringing a measure of order to The Chaos.  I definitely think it was an important point to raise.  I think using terms like "atomic" and "natural" in all this lingo tends to get us thinking to abstract and we forget that we're not actually talking about the indestructible, elemental entities of the cosmos.  These data elements are, more often than not, just a bunch of non-sense concocted by a band of numb-skulls (and I do not exclude myself from being a subset of that domain) :-PI think you touched on a great point, which really drives it home for me:[quote][b]Tom.Thomson (6/30/2011)[/b][hr]...a structure that is not future-proof is not a good structure.  When considering what the business rules are when determining what is 2NF, 3NF, EKNF, BCNG or 4NF one should consider what changes in business rules may happen next year, or the year after, or even later.  For 1NF one should consider what someone might produce as a new requirement on the database in the future, as a new requirement may make a list-valued attribute into a violation of 1NF.[/quote]And as most users here probably have come to realize, being cogs in the corporate machine, this one principle definitely holds true in the business world: "The only constant is change".  Ultimately, this is a good thing overall, I think.  Humans are inherently less than perfect, and therefore our attempts to define and organize our world will be equally so.  Our propensity for constant change gives us the ability to adapt to the unforeseen eventualities that will inevitably come due to our insufficient understanding of the world we seek to explain.This means that anything we build must facilitate (or at least account for), and not restrict this element of change.  A system that is too rigid will not hold up over time.All this puts me in a mind to do what I love to do most in these situations (as some may have come to realize from my previous posts)... Over-simplify a vast and complex issue by distilling it down to a quaint movie reference! :-DI call it "The Matrix Paradox" (I probably read it somewhere, so I'm sure I'm not the only one who calls it that... :-P ) ...Premise: Humans are imprecise, inaccurate, and to some degree unpredictable.  For that reason, they are incapable of existing within a "perfect" system, because it would be too rigid, leaving no tolerance for error (which humans generate with great proficiency).Dilemma:- For the system to function properly it should be perfect     &amp;gt; (i.e. - in perfect adherence with "the rules".  Note: Rules created by who? Exactly.)- If the system is perfect it doesn't workSummary:As [i]The Architect[/i] put it, "The problem ... is choice."  Or to put it a different way (and I think more accurately): The problem is human error.  Because in the world of 1 and 0, true and false, right and wrong... Choice is an illusion, and in the end you are either right or wrong.  In adherence or in error.Train of Thought... derailed! :cool:</description><pubDate>Fri, 15 Jul 2011 12:37:08 GMT</pubDate><dc:creator>pjcrout</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]David Portas (7/5/2011)[/b][hr]Tom, I'm well aware that the relational model and TTM require relations that consist of atomic attributes. What I see as a problem is that your article uses another criterion for 1NF:[quote]Is this or isn't it in 1NF? [i][b]That depends on what the database is expected to do with it[/b][/i][/quote]You define 1NF based on intended usage. This doesn't make any sense to me. Suppose I create a relation variable using my RDBMS and call it T. T happens to have a string attribute that encodes multiple telephone numbers. That's OK because a string is by definition an atomic value. According to your definition T is in 1NF as long as "the database doesn't know anything about that encoding". I don't have any problem with that so far because I don't ever intend to split out the phone numbers in the database. T is in 1NF.Later on however, I change my mind. I want to split out the phone numbers using some code in my database. As you put it: "There will have to be code in queries to decode that complex string". Now you say that T is not in 1NF. But it's the same relation variable as before! The atomicity of the string and the structure of the relation has not changed. Just because I intend to use it differently T cannot suddenly stop being a relation. So your suggested criteria is an [i]additional[/i] requirement for 1NF over and above the requirement for T to be a relation - a relation being a thing which is defined by structure alone and never by intended usage. According to you, the relation variable T can change from moment to moment from being in 1NF to not being in 1NF depending on what code exists in my database.[/quote]I think you are missing the point -it's not about precluding the database from holding certain objects, it's about ensuring that the objects it holds have [i]so far as the database is concerned[/i] only types which are directly supported by the relational calculus or algebra language made available by the database system.  Not an issue of abstruse theory, a really simple and straightforwards practical issue.It's not the decision to have the object in the database that breaks 1NF - it's the decision to have the object in the database and require it to be manipulated by the database language (relational algebra or calculus with the interfaces of supported types) in a manner that requires understanding of its structure without including support for its type in the database language. And don't try to tell me that the type of a list of telephone numbers is string - because correct typing precludes nonsense like allowing 'XvMZQ54(w@^%' as list of telephone numbers.If you modify the database system to recognise a type "phone number list" and add the operations that implement this type (append number to list, delete number from list, is number in list, is list empty, create empty list, merge two lists, and so on) then you are still in 1NF.  If instead to decide to write those longhand in the database language using the primitives for operating on lists of characters (strings) provided by your database then you are not in 1NF because the object is not atomic in the database.</description><pubDate>Thu, 14 Jul 2011 03:54:10 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]SanDroid (7/5/2011)[/b][hr]For anyone that want's to read what Ted Codd wrote about NULL (SQL):[url]http://en.wikipedia.org/wiki/Null_(SQL)[/url]  I know it is wikipedia, but so what if it is correct.[/quote]It's a pretty good reference.  Some trivial errors (for example the two distinct null values were introduced more than a decade before the RM V2 book, not in it.). Most importantly it points out that NULL indicates the absence of a value for the datum, not the truth value "unknown".[quote]For anyone that want's to read what Codd &amp; Date wrote about  1NF:[url]http://www2.yk.psu.edu/~lxn/IST_210/normal_form_definitions.html[/url][/quote]It's certainly not what Codd wrote about normal forms.  It may be something like some of what CJD write (I haven't read all his writings), but referring everything to THE primary key instead of to the set of primary keys (usually now called candidate keys) is not at all Codd's style.[quote]It is the RDBMS Ternary Value theory Codd introduced that most people seem to confuse the most.  Codd never said NULL = Unknown.[quote]The Unknown truth value is generated whenever Null is [i][b]compared[/b][/i] with any data value, or with another Null.[/quote] [/quote]That's another of those errors - sometimes comparing NULL with NULL delivers True (Paul's clever use of INTERSECT mentioned above explicitly depends on this), and TC stated that it should deliver True in some cases (cases listed in his SIGMOD 1979 paper, and the Dec 79 ToDS version).  Wikipedia even mentions these particular comparisons further down the page.If you read the Dec 79 ToDs paper (it's available somewhere online without requiring an ACM library subscription, but I'm not telling you where because that would be encouraging copyright violation :angry: - as well as encouraging TC's view that the research belongs to everyone ;-)) you will see that TC had introduced the two NULLs and 4VL system before then, but thought it reasonable to merge the NULLs and stick to 3VL for many purposes.  If you read the 1990 book, in particlular the rules for replacing NULLs with actual values (you can download thbe book from the ACM library for $15, if I recall correctly) you'll see that the i and a NULLs aren't actually "inapplicable" and "applicable" since that would have precluded convertions between them.</description><pubDate>Thu, 07 Jul 2011 06:07:32 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]SanDroid (7/5/2011)[/b][hr]To me It reads like two or more people are re-writing/debating the original RDBMS theory material. Trying to fit things Chris Date or Ted Codd wrote about to there own personal thoughts or ideas. I am not a big fan. People come here to learn. They should not be given the personal thoughts abouts someone elses ideas as a fact. That is misinformation.[/quote]I agree!  SanDroid for President!  :-)</description><pubDate>Tue, 05 Jul 2011 17:44:27 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]SQLkiwi (7/4/2011)[/b][hr][quote][b]James Goodwin (7/1/2011)[/b][hr]It would be nice if there was a more clean SQL syntax than (X is NULL OR X &amp;lt;&amp;gt; 'Value'), but that is implementation, not theory.[/quote]There is an alternative to that, which I use quite a lot:[url]http://sqlblog.com/blogs/paul_white/archive/2011/06/22/undocumented-query-plans-equality-comparisons.aspx[/url][/quote]Nice blog post, Paul. Never thought of trying that route. One more trick in my bag, thanks!</description><pubDate>Tue, 05 Jul 2011 16:23:40 GMT</pubDate><dc:creator>Jan Van der Eecken</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]SQLkiwi (7/5/2011)[/b][hr][quote][b]David Portas (7/5/2011)[/b][hr]...But it's the same relation variable as before! The atomicity of the string and the structure of the relation has not changed. Just because I intend to use it differently T cannot suddenly stop being a relation...[/quote]Is it just me, or does this sound like a quantum theory of normal forms? :laugh:[/quote]To me It reads like two or more people are re-writing/debating the original RDBMS theory material. Trying to fit  things Chris Date or Ted Codd wrote about to there own personal thoughts or ideas. I am not a big fan. People come here to learn.  They should not be given the personal thoughts abouts someone elses ideas as a fact.  That is misinformation.An actual Quote from Chris Date: "There was a general feeling at the time that research belonged to everyone. It was public. People at IBM didn’t like it, but all the papers were available. Ellison felt it was obvious that Codd was onto something, so he built a clone of what Ted outlined"If you want to keep writing about what you "think" Codd &amp; Date meant please read some of the things they wrote yourself before posting.  Try to stay away from what Oracle or IBM re-wrote later.  Post some live links to that factual information along with your thoughts.  Since we have several examples in this thread of how not to do that.  Here are some examples of how to do it: For anyone that want's to read what Ted Codd wrote about NULL (SQL):[url]http://en.wikipedia.org/wiki/Null_(SQL)[/url]  I know it is wikipedia, but so what if it is correct.For anyone that want's to read what Codd &amp; Date wrote about  1NF:[url]http://www2.yk.psu.edu/~lxn/IST_210/normal_form_definitions.html[/url]FYI: The main definition for 1NF in the link above has the word DOMAIN in it.  This is how Ted and later Chris both saw it. FYI: Chris Date did not work with Ted Codd on the original definition of NULL. Chris came into the picture later.It is the RDBMS Ternary Value theory Codd introduced that most people seem to confuse the most.  Codd never said NULL = Unknown.[quote]The Unknown truth value is generated whenever Null is [i][b]compared[/b][/i] with any data value, or with another Null.[/quote] Codd saw the need for differant levels of NULL in a RDMS to store and return the lack of a value properly.  In 1990 Codd wrote in The Relational Model for Database Management version 2 [quote]"The single Null mandated by the SQL standard is inadequate, and should be replaced by two separate Null-type markers to indicate the reason why data is missing. These two Null-type markers are commonly referred to as 'A-Values' and 'I-Values', representing 'Missing But Applicable' and 'Missing But Inapplicable', respectively."[/quote]NULL, just like the mostly empty void we call the universe, is a large subject.:cool:</description><pubDate>Tue, 05 Jul 2011 09:03:45 GMT</pubDate><dc:creator>SanDroid</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]David Portas (7/5/2011)[/b][hr]...But it's the same relation variable as before! The atomicity of the string and the structure of the relation has not changed. Just because I intend to use it differently T cannot suddenly stop being a relation...[/quote]Is it just me, or does this sound like a quantum theory of normal forms? :laugh:</description><pubDate>Tue, 05 Jul 2011 07:23:27 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>Tom, I'm well aware that the relational model and TTM require relations that consist of atomic attributes. What I see as a problem is that your article uses another criterion for 1NF:[quote]Is this or isn't it in 1NF? [i][b]That depends on what the database is expected to do with it[/b][/i][/quote]You define 1NF based on intended usage. This doesn't make any sense to me. Suppose I create a relation variable using my RDBMS and call it T. T happens to have a string attribute that encodes multiple telephone numbers. That's OK because a string is by definition an atomic value. According to your definition T is in 1NF as long as "the database doesn't know anything about that encoding". I don't have any problem with that so far because I don't ever intend to split out the phone numbers in the database. T is in 1NF.Later on however, I change my mind. I want to split out the phone numbers using some code in my database. As you put it: "There will have to be code in queries to decode that complex string". Now you say that T is not in 1NF. But it's the same relation variable as before! The atomicity of the string and the structure of the relation has not changed. Just because I intend to use it differently T cannot suddenly stop being a relation. So your suggested criteria is an [i]additional[/i] requirement for 1NF over and above the requirement for T to be a relation - a relation being a thing which is defined by structure alone and never by intended usage. According to you, the relation variable T can change from moment to moment from being in 1NF to not being in 1NF depending on what code exists in my database.The relations which according to your definition aren't in 1NF could also still satisfy higher normal forms like BCNF or 5NF even though you say they violate 1NF. If I've misunderstood your criterion for deciding 1NF then I think you need to clarify this point. To me it is fundamental that 1NF means nothing more than the definition of a relation, which is simply a data structure in a database. 1NF does not depend in any way on the intended usage of the data but purely on its structure.[quote][b]Tom.Thomson (7/1/2011)[/b][hr]If NULL is a type of value (which is certainly the domain-theoretic interpretation of it)[/quote]However Codd says that Null marks are not values and Date and pretty much everyone else agrees. Codd never used the the domain theory definition of a domain in any work of his that I'm aware of. In fact he specifically requires that null marks be independent of domain - by which I understand him to mean they are orthogonal to the concept of a domain. If you are putting forward a different definition of nulls and null behaviour (as you have done in another SSC thread) then you ought to make it clear that your 1NF definition isn't generally applicable to other systems using different things called "nulls".</description><pubDate>Tue, 05 Jul 2011 05:57:01 GMT</pubDate><dc:creator>sqlvogel</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]James Goodwin (7/1/2011)...Regarding Nulls, I haven't followed the theoretical literature enough to notice that there is a raging debate here, but from a practical standpoint I have noticed that people getting started in SQL tend to get burned by NULL and then avoid it for a while.  Then they start doing OUTER joins and are forced to become more familiar with accommodating NULLs.  It would be nice if there was a more clean SQL syntax than (X is NULL OR X &amp;lt;&amp;gt; 'Value'), but that is implementation, not theory.--JimFive[/quote]Speaking of NULL implementations, when I first started coding in Oracle SQL, the biggest snag I ran into was that an empty string '' is NULL. [code="other"]'Smith' &amp;gt; ''   : False'' = ''   : False'' is null   : True[/code]</description><pubDate>Mon, 04 Jul 2011 09:01:00 GMT</pubDate><dc:creator>Eric M Russell</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]James Goodwin (7/1/2011)[/b][hr]It would be nice if there was a more clean SQL syntax than (X is NULL OR X &amp;lt;&amp;gt; 'Value'), but that is implementation, not theory.[/quote]There is an alternative to that, which I use quite a lot:[url]http://sqlblog.com/blogs/paul_white/archive/2011/06/22/undocumented-query-plans-equality-comparisons.aspx[/url]</description><pubDate>Mon, 04 Jul 2011 07:54:35 GMT</pubDate><dc:creator>Paul White</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Tom.Thomson (7/2/2011)[/b][hr]Actually, I wouldn't go so far as to call any of the comments so far unprofessional.[/quote]There was one in particular (not directed at you) that seemed it would derail this thread (as so many threads have been in the past) as I read through it.  You and some of the other good folks did such a good job at keeping things on track that it's no wonder that you have missed it.  Like I said, very well done.</description><pubDate>Sat, 02 Jul 2011 14:18:09 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Jeff Moden (7/2/2011)[/b][hr]Tom,I share JJ B's sentiments.  Whether I agree with the content of the article or not (and, considering the zealous and sometimes unprofessional retorts in this discussion by others, is a fact that shall remain a mystery to all but me for I am not as well prepared to defend a position on the subject as you :-P ), I very much appreciate the extremely professional manner in which you're dealing with sometimes hostile and/or opinionated posts.I'll also state that you've brought some concepts and perceptions to light through your fine article which I've not previously considered and that, too, is very much appreciated.Very well done on all accounts, Sir.  I very much look forward to your upcoming articles on this hotly contested subject.[/quote]Jeff,Thank you very much for your comments.  As you know, your opinion is one that I greatly respect, so your kind remarks are very welcome, very much appreciated.  To know that I've brought new concepts and perceptions to your attention makes me very proud.Actually, I wouldn't go so far as to call any of the comments so far unprofessional. Some are certainly not well thought out, but there my response is of course aimed at enabling people to think things out more clearly - and I'm not sufficiently foolish to believe that all my opinions are correct, so I'm perfectly happy for people to challenge them as long as they do it without introducing personal attacks, which I don't think have been introduced in the comments here.</description><pubDate>Sat, 02 Jul 2011 10:50:38 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>Tom,I share JJ B's sentiments.  Whether I agree with the content of the article or not (and, considering the zealous and sometimes unprofessional retorts in this discussion by others, is a fact that shall remain a mystery to all but me for I am not as well prepared to defend a position on the subject as you :-P ), I very much appreciate the extremely professional manner in which you're dealing with sometimes hostile and/or opinionated posts.I'll also state that you've brought some concepts and perceptions to light through your fine article which I've not previously considered and that, too, is very much appreciated.Very well done on all accounts, Sir.  I very much look forward to your upcoming articles on this hotly contested subject.</description><pubDate>Sat, 02 Jul 2011 10:06:11 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>A couple of people have suggested that the normalised forms of the tables used as examples should have been included in the article (some in comments here, others in personal messages and email).  I think that’s a pretty fair comment on the article, so here I have provided (a) SQL Code to generate the tables normalised to 1NF assuming that the structures of the two messy fields do have to be understood by the database system, (b) the rows that would be in those new tables to hold the data contained in the rows shown in the original un-normalised example, and (c) a couple of extra comments that people may find useful.Here is the SQL code:[code]CREATE TABLE Employee (    PersonName nvarchar(64) NOT Null Primary Key,    OfficePhone varchar(18) NOT Null, -- everyone has an office phone    HomePhone varchar(18) NOT Null, -- everyone has a home phone    )CREATE TABLE FaxPhone (    PersonName nvarchar(64) NOT Null Primary Key,    FaxNumber varchar(18) NOT Null,    CONSTRAINT FK_FaxPhone_Person Foreign Key         (PersonName) references Employee(PersonName),    )CREATE TABLE Spouse (    PersonName nvarchar(64) NOT Null Primary Key,    SpouseName nvarchar(64) NOT Null,    SpouseDoB date NOT Null,    Constraint FK_Spouse_Employee  Foreign Key        (PersonName) references Employee(PersonName),    )[/code]And here are the representations of the original two rows in the three tables of the normalised version.[code][b]table Employee[/b]	[b]PersonName  	  OfficePhone	HomePhone[/b]	Bill	          121 643 4321	121 641 3159	Frank	          121 643 4324	119 861 8183[b]table FaxPhone[/b]	[b]PersonName	FaxNumber[/b]	Bill	        121 643 4497	Frank	        121 643 4497[b]Table Spouse[/b]	[b]PersonName	SpouseName	SpouseDoB[/b]	Bill	        Mary	        19951108[/code]An important extra point here: the the SpouseDoB column uses the [i]date[/i] type not varchar because of course another part of normalisation to 1NF was to ensure that this date is held as an atomic item whose interface is available in the databases language; here SQL is assumed, and the language provides an interface for the [i]date[/i] type which allows the year, month, and day components of the atomic date object to be obtained as well as allowing easy comparison and equality testing for dates.  The display form I used above is of course (one of) the international standard(s), and is fully supported as an input format by SQL.Another extra point: if enough employees share fax numbers (as might be expected from the two shown) it might be a useful optimisation to have a separate table for Fax numbers allowing a surrogate primary key to be assigned, thus saving a few bytes per row in the FaxPhone table (the FaxNumber column in the FaxPhone table would be replaced by an integer – maybe even a tinyint - FaxNumberId column).  This is not part of normalisation for the structure shown above, but could be a useful optimisation if the properties of the real data were such that it was more efficient than the schema shown above.  This does sometimes happen, for example when a company provides only one or two Fax numbers for an office block occupied by more than a thousand employees, but this seems to be pretty unusual in the real world so doing this will often deliver worse performance than not doing it, although there are those who claim it is always an optimisation and never a pessimisation.</description><pubDate>Sat, 02 Jul 2011 09:49:57 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>In addition to the phone number and zip code decomposition I'd like to mention another is-it-atomic data: email address.  You can't really do much without the whole thing in one piece.  However if you store the username in one field and the domain in another you get the benefit of tight index scans finding all the users at a particular domain (assuming you have the domain indexed).Similarly the storage of IP addresses is pretty straightforward, right?  varchar(15) is sufficient for period-separated 3 digit strings.  Use hex and we only need varchar(11).  Or is it 4 tinyint columns?  Wait, isn't that the same as an int?  You mean we can index every IP address on the IPv4 Internet with a single Int field?  Yup.  :)  Maybe it wouldn't be the most human-readable form but it would be pretty efficient in both space-use and indexability.  You can see that the only sure answer is the often quoted "it depends."</description><pubDate>Fri, 01 Jul 2011 20:58:44 GMT</pubDate><dc:creator>Mike Dougherty-384281</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>Tom, I think that overall this is a good primer on first normal form. You brought up the point that what constitutes an atomic value depends on the context with which it is used within the database. For example, even zip codes and phone numbers can potentially be deconstructed into separate key values, and the same for phone and even SSN. However, for the majority of databases (outside the USPS and AT&amp;T), these codes are not decomposed by either the database or the application, so they can be considered atomic within that context.zip_code--------477224 = Indiana77 = Vanderburgh County22 = University of EvansvilleYou also pointed out that relations between something like a customer and their phone numbers(s) or spouse(s) would be best contained in separate table. Philosophical debates aside, these points, in addition to having primary key (preferably non-surrogate), are what's important to get across when introducing 1NF.</description><pubDate>Fri, 01 Jul 2011 12:17:34 GMT</pubDate><dc:creator>Eric M Russell</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>Now for the serious answer to the interesting points in David's post.[quote][b]David Portas (7/1/2011)[/b][hr]All the normal forms are concerned with dependencies satisfied by relations (or relation variables). If 1NF is understood to be exactly equivalent to the definition of a relation then 2NF, 3NF, BCNF, etc certainly also imply 1NF, just because relations are the only things that satisfy those normal forms.[/quote]  Actually definitions of 2NF that state that the reation has to be in 1NF in order to be in 2NF are extremely common.  But in fact I agree with you: 1NF is essentially the property of being a relational database theory relation. [quote] However, if you understand ([i]mis[/i]understand in my view) 1NF to stipulate other conditions over and above the basic requirement that a schema consists only of relations then the higher normal forms don't have to satisfy 1NF.[/quote]I'm going to base what follows on the idea that you have picked this idea up from reading the 3rd manifesto (or reading about it).Actually, I think you have misunderstood the third manifesto here.  Data and Darwen explicitly retain the atomicity principle.  Unfortunately they thought they understood type systems, so didn't bother to learn how to describe type systems that permit existential types, and believed that CLOS's intent for a firm separation of the implementation of a method from its interface definition meant that no part of the method was part of an instantiation of the type and that this would be generalisable to all possible type systems, so they make a complete pigs breakfast of suggesting how support for existential types (if I remember rightly they don't even call them existential types, so even a type theorist can have trouble seeing what they are trying to get at) and unless things have changed in another edition of the book they have no concrete proposal at all for a type system, just a few general principles for which there is no certainty that they will actually work and which seem to imply that the type system will be undecidable (not necessarily a bad thing, in fact I was trying to go there in the late 80s but I came to the conclusion that it was too dangerous; but point someone like Cardelli at it and something good might eventually come out).  The point here is that despite their confused and somewhat incoherent verbiage about the type system, they retain the principle of atomicity: the only domains allowed are those all of whose interfaces are made available in the language that the database system uses for it's relational calculus.[quote]For example if you think that 1NF disallows certain types of value in a relation then that's something that isn't required by any other NF. [/quote]You are wrong.  All normal forms disallow non-atomic values.  Atomic values are those all of whose interfaces are directly implied by their type, which is available in the database system, and may have pretty complex structures (just as atoms in physics have pretty complex structures of electrons, protons, neutrons, gluons,... but are still atomic).[quote]Or if you think 1NF means a relation must have a "primary" key and that "primary" means something other than a candidate key then you'd also be defining something that was not required by any other NF.[/quote]I hope no-one thionks that "a primary key" means something other than "a candidate key"; of course "a primary key" is the traditional terminology that has been used in relational theory for 41 years, and "a candidate key" was presumably introduced because lingustically challenged people couldn't understand the difference between "a primary key" and "the primary key".[quote]This is why I don't think it's productive to say that 1NF disallows certain types of value - even complex values like XML that the DBMS may later deconstruct into other values; or relation values; or other "exotic" objects like multimedia. Tom's article suggests a criteria for excluding certain types of value from 1NF based on how you intend to use them in the database. I disagree. If it is useful to put some value in a relation and potentially subject it to the same dependencies as any other relation then it's perfectly reasonable to want to apply the principles of normal forms to it. In Codd's day databases were concerned with numbers and strings and little else. That's not the world we are in today. As soon as we start placing limits on the type of data values a relation is "allowed" to include then we weaken the relational model and make it less useful.[/quote]I guess we will continue to disagree then.  But a definition of 1NF based on the third manifesto is not going to be any use at all to anyone working with (or planning to work with) real live databases, who are the target audience for the article, and that's why I'm sticking with the original definition.Another restriction that Codd insisted on was that the relational calculus was a first order calculus: a relation could not be the value of an attribute.  The third manifesto has ditched this - and that ensures that no relational calculus for a 3rd manifesto database can be based on first order predicate calculus (:hehe: that's the sort of mess you get into when you let a Cambridge mathematician, like Date, mess about with a nice clean theory developed by an Oxford man, like Codd :-D).  It will have to use higher order predicate calculus.  Ouch!  I'll stick with tradition and continue to assert that 1NF explicitly excludes that possibility.  Of course given that an RDB-theoretical relation is a function on time to equivalence classes (so that column order can be permuted) of set theoretic maps (since you have to know which attribute is which somehow, and attribute order is not significant so you can't use a mathematical relation) on a fixed bag (not a set, since the elements are named) of named domains I find it a bit difficult to understand just what was intended by that 3rd manifesto statement.</description><pubDate>Fri, 01 Jul 2011 11:57:48 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]David Portas (7/1/2011)[/b][hr]For example if you think that 1NF disallows certain types of value in a relation then that's something that isn't required by any other NF. [/quote]I think we need a &amp;lt;joke&amp;gt;....&amp;lt;ekoj&amp;gt; delimiter in this forum, but I guess I can do it with smiley variants.Oh David, I hope you realise what you are saying! :hehe: :hehe: :hehe: :w00t:If NULL is a type of value (which is certainly the domain-theoretic interpretation of it) then none of the normal forms (apart from possibly 1NF) can ban it!  :-D :-) :-D A relation can be in 5NF and allow NULLs ( :Wow: presumably even in prime attributes, which even I would never accept) because you can't ban values in the higher normal forms?  You appear to have gone from solidly anti-null to extremely :alien: pro-NULL!Have you really come over to the side of the just and ditched the anti-null nonsense? :-P :cool:It's a pity that you didn't really mean it.  </description><pubDate>Fri, 01 Jul 2011 11:05:49 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>Thanks Tom.  This is a topic that needs to be covered time and again.  Good work.</description><pubDate>Fri, 01 Jul 2011 10:20:56 GMT</pubDate><dc:creator>SQLRNNR</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]James Goodwin (7/1/2011)[/b][hr]When I first read this article I was surprised by the statement that having a list in a column could be 1NF if the database was never asked to process that list.  This seems to fly in the face of how normalization is taught.However, after thinking about it I understand the point.  If, for example, I have a table that stores user settings that looks like:USER          SETTINGS----------------------------BOB            ASDFGHJKLTOM           QWERTYUIThe fact that each character in the settings column might mean something to the application doesn't affect the database at all.  This is, in fact, no different than the phone number list, it just looks different.[/quote]You know, we could take what you said even further... What happens to a phone number when the government decides to change phone area codes?  Isn't a phone number simply a 3 part identifier.  Area Code, Locale, and Phone Line identifier (names may vary).  So shouldn't we create a whole table that is nothing but area codes, and a table for the locale, and a table for the phone line identifiers?  Or should we just leave a single phone number?  If you go to the length of decomposing the phone number inside the DB, you need to split it out.  If you only store the value and place no meaning on it, you don't care what it is.</description><pubDate>Fri, 01 Jul 2011 08:06:34 GMT</pubDate><dc:creator>venoym</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>When I first read this article I was surprised by the statement that having a list in a column could be 1NF if the database was never asked to process that list.  This seems to fly in the face of how normalization is taught.However, after thinking about it I understand the point.  If, for example, I have a table that stores user settings that looks like:USER          SETTINGS----------------------------BOB            ASDFGHJKLTOM           QWERTYUIThe fact that each character in the settings column might mean something to the application doesn't affect the database at all.  This is, in fact, no different than the phone number list, it just looks different.Regarding Primary Keys (Candidate keys) it might help people to think that the word Primary here is referencing Prime, not First.  The Primary Key is not the "most import candidate key"  All Candidate keys are Primary:  They are a relatively prime (having no duplication) set of columns in the relation.  The use of the term Primary Key for the "Most important Index" in SQL Engines is an unfortunate ambiguity.Regarding Nulls, I haven't followed the theoretical literature enough to notice that there is a raging debate here, but from a practical standpoint I have noticed that people getting started in SQL tend to get burned by NULL and then avoid it for a while.  Then they start doing OUTER joins and are forced to become more familiar with accommodating NULLs.  It would be nice if there was a more clean SQL syntax than (X is NULL OR X &amp;lt;&amp;gt; 'Value'), but that is implementation, not theory.--JimFive</description><pubDate>Fri, 01 Jul 2011 07:14:05 GMT</pubDate><dc:creator>James Goodwin</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Tom.Thomson (6/30/2011)[/b][hr][quote][b]KaplanDBA (6/30/2011)[/b][hr]Setting aside the fact that I'd rather normalize this further than just 1NF, I want to understand your argument better:[quote]Well, NULL should always mean "The database doesn't contain this value", so it would be used in the case where the value is applicable but unknown for one reason or another as well as in the case where there is no value because it's not applicable; if you want to be able to tell for certain that the value is inapplicable, you can't use NULL unless you can guarantee that there are no cases where the value is applicable but unknown.[/quote]What you're suggesting is that in the case where "Person A: Cannot possibly have a phone number" it would be better to use a garbage value like 000-000-0000,but in the case where "Person A: Could have a phone number but we don't know it" you suggest to use NULL?[/quote]No, that's not at all what I'm suggesting.  See below.[quote]What is the fundamental reasoning behind that decision? Is it simply based off the original definitions of NULL from Codd et al.?[/quote]Codd's best known NULL definition in fact defines two distinct NULL values, and he uses a 4-valued logic instead of a 3-valued logic, and I wouldn't dream of using his 4VL for anything so I am certainly not using that definition!It's really a question of whether you want to distinguish various cases - and of whether the people who take the database over from you in the future will perhaps want to make distinctins that you currently believe are unimportant.  If you have allowed your schema to have some values that can be inapplicable, you need to have some way of expressing "this is inapplicable".  If you have a real world situation where some data may have to be entered before all the attributes are known, you need some way of expressing "we haven't put this value in the database".  You can't use NULL to represent both if you want to be able to distinguish the two cases - but if you don't want to distinguish the two cases and are sure that no-one in future will then of course you can use NULL for both (but it's a good idea if you do so to document it very clearly in very many places).  Since real world situations always seem to result in values being absent for one silly reason or another, it's a good idea to reserve NULL for that case - which means you need something other than NULL for the "known to be inapplicable" case.  That's really all there is to it - no amazing mathematical basis or complex logic, just something simple and obvious.Chris Date suggested that ALL nulls should be replaced by special values; I don't believe that's possible, but some people do.[/quote]Ahhh, ok. So this answers my original query regarding usage of NULLs vs. arbitrary values. Makes sense now. For my purposes though, I think I will stick with NULLs unless the specific situation calls for a distinction between unknown value and non-applicative value.Thanks for explaining and looking forward to next article (and the lively debate that is sure to follow!);-)- George</description><pubDate>Fri, 01 Jul 2011 05:52:22 GMT</pubDate><dc:creator>George H.</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Jan Van der Eecken (7/1/2011)[/b][hr]Just one question. When you say[quote][b]Tom.Thomson (6/30/2011)[/b]In my view nothing that isn't all of 2NF, 3NF, and EKNF [b]as well as 1NF[/b] is a good database structure...[/quote]are you implying that you can have a model that is in 2NF, 3NF etc that is not 1NF? I've always been taught that for something to me 2NF it does have to be 1NF in the first place. Or am I reading something into that sentence that isn't there?[/quote]All the normal forms are concerned with dependencies satisfied by relations (or relation variables). If 1NF is understood to be exactly equivalent to the definition of a relation then 2NF, 3NF, BCNF, etc certainly also imply 1NF, just because relations are the only things that satisfy those normal forms.However, if you understand ([i]mis[/i]understand in my view) 1NF to stipulate other conditions over and above the basic requirement that a schema consists only of relations then the higher normal forms don't have to satisfy 1NF.For example if you think that 1NF disallows certain types of value in a relation then that's something that isn't required by any other NF. Or if you think 1NF means a relation must have a "primary" key and that "primary" means something other than a candidate key then you'd also be defining something that was not required by any other NF.This is why I don't think it's productive to say that 1NF disallows certain types of value - even complex values like XML that the DBMS may later deconstruct into other values; or relation values; or other "exotic" objects like multimedia. Tom's article suggests a criteria for excluding certain types of value from 1NF based on how you intend to use them in the database. I disagree. If it is useful to put some value in a relation and potentially subject it to the same dependencies as any other relation then it's perfectly reasonable to want to apply the principles of normal forms to it. In Codd's day databases were concerned with numbers and strings and little else. That's not the world we are in today. As soon as we start placing limits on the type of data values a relation is "allowed" to include then we weaken the relational model and make it less useful.</description><pubDate>Fri, 01 Jul 2011 03:35:30 GMT</pubDate><dc:creator>sqlvogel</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Jan Van der Eecken (7/1/2011)[/b][hr]Hi Tom,Firstly, thanks for a great article, and an even better discussion following up from it. I'm really looking forward to parts 2 and 3.[/quote]Thanks. I'm glad you liked it.[quote]Just one question. When you say[quote][b]Tom.Thomson (6/30/2011)[/b]In my view nothing that isn't all of 2NF, 3NF, and EKNF [b]as well as 1NF[/b] is a good database structure...[/quote]are you implying that you can have a model that is in 2NF, 3NF etc that is not 1NF? I've always been taught that for something to me 2NF it does have to be 1NF in the first place. Or am I reading something into that sentence that isn't there?Regards,Jan[/quote]I didn't intend to suggest that somethiong could be in 2NF without being in 1NF, or in 3NF that isn't in 2NF - just that going to 1NF is generally not on its own going far enough to produce to produce a good structure, one has to do lots of other normalisation too.  I guess I should have phrased it better.</description><pubDate>Fri, 01 Jul 2011 02:59:35 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Tom.Thomson (6/30/2011)[/b][hr]...All of IMS, IDS, IDMS used the term "primary key" long before 1970...[/quote]Wow! Tom, that takes me back :Wow:IDMS was the first database I worked with :w00t:Nice article by by the way :-)</description><pubDate>Fri, 01 Jul 2011 02:29:02 GMT</pubDate><dc:creator>David Burrows</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>Hi Tom,Firstly, thanks for a great article, and an even better discussion following up from it. I'm really looking forward to parts 2 and 3.Just one question. When you say[quote][b]Tom.Thomson (6/30/2011)[/b]In my view nothing that isn't all of 2NF, 3NF, and EKNF [b]as well as 1NF[/b] is a good database structure...[/quote]are you implying that you can have a model that is in 2NF, 3NF etc that is not 1NF? I've always been taught that for something to me 2NF it does have to be 1NF in the first place. Or am I reading something into that sentence that isn't there?Regards,Jan</description><pubDate>Fri, 01 Jul 2011 01:59:40 GMT</pubDate><dc:creator>Jan Van der Eecken</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]SanDroid (6/30/2011)[/b][hr][quote][b]Eric M Russell (6/30/2011)[/b][hr]I never contain phone or address related columns in something like the primary customer table. #1 An address or phone number isn't an attribute or measure of the customer; the telephone company assigns it to us for the duration of our contract, so we only have a relationship with it.#2 Multiple customers may share the same address or phone number, and for many applications that's useful to know.#3 Virtually nobody has only one phone number these days. #4 There are usually additional contextual information like type, priority (primary, seconday, etc.), and also effective_from_date and effective_to_date that users need to know. [/quote]Everything you mention is a Normalization that is done after 1NF.Customer Phone number relations are used in many texts to explain the differant levels of Normalization.I bet Tom was not trying to imply this as good DB structure.  [/quote]That was a pretty safe bet!In what was meant to be a short and simple description of first normal form without any reference to other normal forms there was no hope of producing a good DB structure.  In my view nothing that isn't all of 2NF, 3NF, and EKNF as well as 1NF is a good database structure, and if you can get to BCNF or even 4NF without violating the representation principle a good structure will have those properties too.  Also, a structure that is not future-proof is not a good structure.  When considering what the business rules are when determining what is 2NF, 3NF, EKNF, BCNG or 4NF one should consider what changes in business rules may happen next year, or the year after, or even later.  For 1NF one should consider what someone might produce as a new requirement on the database in the future, as a new requirement may make a list-valued attribute into a violation of 1NF.</description><pubDate>Thu, 30 Jun 2011 18:59:47 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Steven993 (6/30/2011)[/b][hr]The only key in relational theory is the candidate key. Or just "key". There is no notion of "primary" which indeed come from SQL.[/quote]Indeed the notion of "primary" key does [i]not[/i] come from SQL.  It's first mention in relational theory is in Codd's internal IBM report, the first public reference in relational theory in his paper in CACM 13/6 (June 1970) both of which antedate SQL by several years.  In fact the term "primary key" antedates relational theory by a long long time.  All of IMS, IDS, IDMS used the term "primary key" long before 1970, and IDMSX was using it before SQL (or even SEQUEL) existed. I'm pretty sure that all the main hierarchical and network DBMSs used the term - but maybe SDC's TDMS was an exception (I have no experience of that DBMS). I think Celko may be correct in suggesting that the term originated with some sort of sequential file system and meant the key which determined the sequence, and the most probable candidate is ISAM (which certainly used the term).</description><pubDate>Thu, 30 Jun 2011 18:37:47 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]KaplanDBA (6/30/2011)[/b][hr]Setting aside the fact that I'd rather normalize this further than just 1NF, I want to understand your argument better:[quote]Well, NULL should always mean "The database doesn't contain this value", so it would be used in the case where the value is applicable but unknown for one reason or another as well as in the case where there is no value because it's not applicable; if you want to be able to tell for certain that the value is inapplicable, you can't use NULL unless you can guarantee that there are no cases where the value is applicable but unknown.[/quote]What you're suggesting is that in the case where "Person A: Cannot possibly have a phone number" it would be better to use a garbage value like 000-000-0000,but in the case where "Person A: Could have a phone number but we don't know it" you suggest to use NULL?[/quote]No, that's not at all what I'm suggesting.  See below.[quote]What is the fundamental reasoning behind that decision? Is it simply based off the original definitions of NULL from Codd et al.?[/quote]Codd's best known NULL definition in fact defines two distinct NULL values, and he uses a 4-valued logic instead of a 3-valued logic, and I wouldn't dream of using his 4VL for anything so I am certainly not using that definition!It's really a question of whether you want to distinguish various cases - and of whether the people who take the database over from you in the future will perhaps want to make distinctins that you currently believe are unimportant.  If you have allowed your schema to have some values that can be inapplicable, you need to have some way of expressing "this is inapplicable".  If you have a real world situation where some data may have to be entered before all the attributes are known, you need some way of expressing "we haven't put this value in the database".  You can't use NULL to represent both if you want to be able to distinguish the two cases - but if you don't want to distinguish the two cases and are sure that no-one in future will then of course you can use NULL for both (but it's a good idea if you do so to document it very clearly in very many places).  Since real world situations always seem to result in values being absent for one silly reason or another, it's a good idea to reserve NULL for that case - which means you need something other than NULL for the "known to be inapplicable" case.  That's really all there is to it - no amazing mathematical basis or complex logic, just something simple and obvious.Chris Date suggested that ALL nulls should be replaced by special values; I don't believe that's possible, but some people do.</description><pubDate>Thu, 30 Jun 2011 17:48:55 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]JJ B (6/30/2011)[/b][hr]I appreciated the article.  More so, I appreciated the author's replies in these discussions. Very informational and more so, a great example of professional conduct in response to criticism.  Thanks for being a good role model.  I look forward to reading your future articles.[/quote]Thank you very much for this comment.  A comment like this means a lot to any author, and is a real encouragement to try to maintain a high standard.</description><pubDate>Thu, 30 Jun 2011 17:23:18 GMT</pubDate><dc:creator>L' Eomot Inversé</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Eric M Russell (6/30/2011)[/b][hr]I never contain phone or address related columns in something like the primary customer table. #1 An address or phone number isn't an attribute or measure of the customer; the telephone company assigns it to us for the duration of our contract, so we only have a relationship with it.#2 Multiple customers may share the same address or phone number, and for many applications that's useful to know.#3 Virtually nobody has only one phone number these days. #4 There are usually additional contextual information like type, priority (primary, seconday, etc.), and also effective_from_date and effective_to_date that users need to know. [/quote]While all true, these are illustrative of the impetus for developing the normal forms other than 1NF so they make a nice place to start the discussion in his later articles should he so choose.  Ultimately, though, I think the main point he was trying to make here is that, if the data in the field has no overt relationship (in an organizational sense) to other elements of the database but simply represents a blob of data which is of importance to a user or application, etc., external to the database, then what it contains does not affect normalization, even if, in the real world whatever it represents is related to other real-world items which are represented as data elements in the database</description><pubDate>Thu, 30 Jun 2011 16:52:48 GMT</pubDate><dc:creator>lnoland</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Eric M Russell (6/30/2011)[/b][hr]Letting SQL Server implicitly do the datatype conversion won't always cut it. For example, is this April 3, 2002 or March 4, 2002? It matters because in one situation I came across, the DOB was manually entered by end users and Name + DOB + ZIP was used to key back to another database.------------4/3/2002Attempting to let SQL Server cast this value will cause the SQL query to fail with an error.------------June 31, 2002[/quote]You are right. I was just pointing out that these date conversions are not so bad when moving them to MS SQL datetime data type.  This data type is so well defined that implicit or explicit conversions from valid string values IMHO happen with an ease that cause me to love, not hate them.  It makes the "magic" happen easy.I do hate trying to convert the JDE julian date format stored on any  RDBMS to any date time datatype.  Would love the prince that came up with that custom data type to have read about 1NF.I love the wings cause that's the fly part.  -- Annonymous Boeing Engineer --</description><pubDate>Thu, 30 Jun 2011 15:14:11 GMT</pubDate><dc:creator>SanDroid</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]SanDroid (6/30/2011)[/b][hr][quote][b]Eric M Russell (6/30/2011)[/b][hr][quote][b]SanDroid (6/30/2011)[/b][hr][quote][b]Eric M Russell (6/30/2011)[/b][hr]I never contain phone or address related columns in something like the primary customer table. #1 An address or phone number isn't an attribute or measure of the customer; the telephone company assigns it to us for the duration of our contract, so we only have a relationship with it.#2 Multiple customers may share the same address or phone number, and for many applications that's useful to know.#3 Virtually nobody has only one phone number these days. #4 There are usually additional contextual information like type, priority (primary, seconday, etc.), and also effective_from_date and effective_to_date that users need to know. [/quote]Everything you mention is a Normalization that is done after 1NF.Customer Phone number relations are used in many texts to explain the differant levels of Normalization.I bet Tom was not trying to imply this as good DB structure.  [/quote]Right that would actually be 2nd normal form, I think. As for a common 1NF violation, and this is probably at the top of my gripe list, is containing "date/time" values in a varchar column, especially when it's inserted by multiple applications and there is no check constraint on the format or consensus about what format to use. For example, a request comes in to modify an ETL extract from a legacy database that's soon to be retired. Filter out patients less than 18 years old. OK, it's a Friday afternoon, and I'm hoping I can knock it out in an hour. However, I then look at the table and see this nonsense. Got to hate that.birthdate------------04/15/20073/3/2004jan 2, 2010[/quote]Weird... Have we worked for the same people? I had the same issue with an Enterprise ETL application once. After three failed deployments by the offshore Dev team my manager begged for help. Took me 20 minutes on a friday to re-do the whole thingredoSIS, debug the location of the only two dates MS SQL could not do an implicit conversion for, get those updated from the lame UI that allowed the entry, and got home early.  All three of these examples can be implicitly converted from string to date by SQL server.  Test it.[code="sql"]create table testdate(birthdate datetime);Insert Into testdateSelect '04/15/2007' birthdate unionSelect '3/3/2004' birthdate unionSelect 'jan 2, 2010' birthdate;select * from testdate;drop table testdate;[/code]I can't hate what MS SQL has made easier for me to work with.[/quote]Letting SQL Server implicitly do the datatype conversion won't always cut it. For example, is this April 3, 2002 or March 4, 2002? It matters because in one situation I came across, the DOB was manually entered by end users and Name + DOB + ZIP was used to key back to another database.------------4/3/2002Attempting to let SQL Server cast this value will cause the SQL query to fail with an error.------------June 31, 2002</description><pubDate>Thu, 30 Jun 2011 14:45:51 GMT</pubDate><dc:creator>Eric M Russell</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Eric M Russell (6/30/2011)[/b][hr][quote][b]SanDroid (6/30/2011)[/b][hr][quote][b]Eric M Russell (6/30/2011)[/b][hr]I never contain phone or address related columns in something like the primary customer table. #1 An address or phone number isn't an attribute or measure of the customer; the telephone company assigns it to us for the duration of our contract, so we only have a relationship with it.#2 Multiple customers may share the same address or phone number, and for many applications that's useful to know.#3 Virtually nobody has only one phone number these days. #4 There are usually additional contextual information like type, priority (primary, seconday, etc.), and also effective_from_date and effective_to_date that users need to know. [/quote]Everything you mention is a Normalization that is done after 1NF.Customer Phone number relations are used in many texts to explain the differant levels of Normalization.I bet Tom was not trying to imply this as good DB structure.  [/quote]Right that would actually be 2nd normal form, I think. As for a common 1NF violation, and this is probably at the top of my gripe list, is containing "date/time" values in a varchar column, especially when it's inserted by multiple applications and there is no check constraint on the format or consensus about what format to use. For example, a request comes in to modify an ETL extract from a legacy database that's soon to be retired. Filter out patients less than 18 years old. OK, it's a Friday afternoon, and I'm hoping I can knock it out in an hour. However, I then look at the table and see this nonsense. Got to hate that.birthdate------------04/15/20073/3/2004jan 2, 2010[/quote]Weird... Have we worked for the same people? I had the same issue with an Enterprise ETL application once. After three failed deployments by the offshore Dev team my manager begged for help. Took me 20 minutes on a friday to re-do the whole thingredoSIS, debug the location of the only two dates MS SQL could not do an implicit conversion for, get those updated from the lame UI that allowed the entry, and got home early.  All three of these examples can be implicitly converted from string to date by SQL server.  Test it.[code="sql"]create table testdate(birthdate datetime);Insert Into testdateSelect '04/15/2007' birthdate unionSelect '3/3/2004' birthdate unionSelect 'jan 2, 2010' birthdate;select * from testdate;drop table testdate;[/code]I can't hate what MS SQL has made easier for me to work with.</description><pubDate>Thu, 30 Jun 2011 14:08:23 GMT</pubDate><dc:creator>SanDroid</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]SanDroid (6/30/2011)[/b][hr][quote][b]Eric M Russell (6/30/2011)[/b][hr]I never contain phone or address related columns in something like the primary customer table. #1 An address or phone number isn't an attribute or measure of the customer; the telephone company assigns it to us for the duration of our contract, so we only have a relationship with it.#2 Multiple customers may share the same address or phone number, and for many applications that's useful to know.#3 Virtually nobody has only one phone number these days. #4 There are usually additional contextual information like type, priority (primary, seconday, etc.), and also effective_from_date and effective_to_date that users need to know. [/quote]Everything you mention is a Normalization that is done after 1NF.Customer Phone number relations are used in many texts to explain the differant levels of Normalization.I bet Tom was not trying to imply this as good DB structure.  [/quote]Right that would actually be 2nd normal form, I think. As for a common 1NF violation, and this is probably at the top of my gripe list, is containing "date/time" values in a varchar column, especially when it's inserted by multiple applications and there is no check constraint on the format or consensus about what format to use. For example, a request comes in to modify an ETL extract from a legacy database that's soon to be retired. Filter out patients less than 18 years old. OK, it's a Friday afternoon, and I'm hoping I can knock it out in an hour. However, I then look at the table and see this nonsense. Got to hate that.birthdate------------04/15/20073/3/2004jan 2, 2010</description><pubDate>Thu, 30 Jun 2011 13:43:11 GMT</pubDate><dc:creator>Eric M Russell</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>I liked this article.  It gave an understandable definition of 1NF without having to be familiar with all the terms of domain theory or relation theory or whatever esoterica people often use to prove how smart they are.  I don't mind nulls for our use.  I work for an insurance company and when someone calls in a claim "I don't know" or "I'm not sure now" are legitimate responses.  Making a table for each possible "I don't know"-- "I don't know the name of the guy I hit", "I don't know my policy number" seems unwieldy.Thanks for the article and I look forward to 2NF!Jim</description><pubDate>Thu, 30 Jun 2011 13:33:28 GMT</pubDate><dc:creator>skt5000</dc:creator></item><item><title>RE: First Normal Form</title><link>http://www.sqlservercentral.com/Forums/Topic1134187-2681-1.aspx</link><description>[quote][b]Eric M Russell (6/30/2011)[/b][hr]I never contain phone or address related columns in something like the primary customer table. #1 An address or phone number isn't an attribute or measure of the customer; the telephone company assigns it to us for the duration of our contract, so we only have a relationship with it.#2 Multiple customers may share the same address or phone number, and for many applications that's useful to know.#3 Virtually nobody has only one phone number these days. #4 There are usually additional contextual information like type, priority (primary, seconday, etc.), and also effective_from_date and effective_to_date that users need to know. [/quote]Everything you mention is a Normalization that is done after 1NF.Customer Phone number relations are used in many texts to explain the differant levels of Normalization.I bet Tom was not trying to imply this as good DB structure.  </description><pubDate>Thu, 30 Jun 2011 13:21:30 GMT</pubDate><dc:creator>SanDroid</dc:creator></item></channel></rss>