Third Normal Form

  • Comments posted to this topic are about the item Third Normal Form

    Tom

  • Thanks for these articles Tom. I have ensure the application developers here are reading all of them. Of course the DBA's already get it but like a lot of small shops the DBAs don't do development and although there are a lot of senior developers they are all building to a OLTP style and are always making these common mistakes.

    I'm anxiously looking forward to the next in the series and the Fourth and Fifth are for me as I've only ever gotten to 3.5 or what I was taught as Boyce Codd Normal Form.

  • Great and clear example. Thanks Tom.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Tom,

    Thanks for the series, it has been very interesting thus far.

    However, I think more distinction needs to be made between 2NF and 3NF. The example seems to be about 2NF. The Candidate key for the original table is City. The Phone number is not dependent upon City at all, it is dependent upon Group. I think that you are trying to show a transitive relationship from the Key to the phone number. But that relationship doesn't seem to exist in the example.

    In the wikipedia example of 3NF http://en.wikipedia.org/wiki/Third_normal_form this transitivity is expressed by having two overlapping candidate keys, one of which determines the non-prime attribute, but the other doesn't. This seems to imply that the difference between 2NF and 3NF is that in 2NF non-prime attributes need to depend on A candidate key, while in 3NF the non-prime attributes need to depend on ALL candidate keys.

    So, can you please expand on this and clear up my confusion?

    Thanks,

    JimFive

  • Due to the extreme flexibility requirements of our reporting system users need at least a nodding acquaintence with the database schema--primarily how specific data can be found in specific tables, and how the tables are related to each other.

    Rather than try and explain anything about normal forms we use the SPOT and Path metaphor. SPOT = Single Point Of Truth (ie the data is only stored in one single place *once*) and there are always paths between tables, so for example a Zone (set of contiguous zip codes) is always connected to only one Office which in turn is in one Region.

    Of course the data has to be 3NF to support this, but I always thought SPOT and Path was a much more natural way to visualize 3NF anyway.

  • When I was studying Relational Database Concepts in college, there was a catchy mnemonic for 3NF: "The key, the whole key, and nothing but the key, so help me Codd."

  • NB: 40 Tree Rd became 30 Tree Rd at some point.

  • James Goodwin (7/28/2011)


    Tom,

    Thanks for the series, it has been very interesting thus far.

    However, I think more distinction needs to be made between 2NF and 3NF. The example seems to be about 2NF. The Candidate key for the original table is City. The Phone number is not dependent upon City at all, it is dependent upon Group. I think that you are trying to show a transitive relationship from the Key to the phone number. But that relationship doesn't seem to exist in the example.

    The Wikipedia article also refers to Carlo Zaniolo's new definition of 3NF, which is provably equivalent to the original definition, and that was the definition I was using - it's much simpler, easier to explain, and easier to see the relationship between 3NF, EKNF, and BCNF when you use Zaniolo's definition. If you prefer to think in terms of the original definition, there is a transitive dependency there: City -> Group (because of the stated business rules that only one group has a depot in any given city - but anyway as (City) is a candidate key we know that City -> Group is an FD of this relation) and Group -> Phone number (because of the stated business rule that each group uses a single order phone number); and the partition of the table represents the Group -> Phone number FD in a separate table, removing Phone number from the original table to eliminate the transitive FD.

    In the wikipedia example of 3NF http://en.wikipedia.org/wiki/Third_normal_form this transitivity is expressed by having two overlapping candidate keys, one of which determines the non-prime attribute, but the other doesn't. This seems to imply that the difference between 2NF and 3NF is that in 2NF non-prime attributes need to depend on A candidate key, while in 3NF the non-prime attributes need to depend on ALL candidate keys.

    Even in 1NF every attribute, prime or not, is dependent on every superkey (that's the definition of a superkey) and hence on every candidate key (as a candidate key is a minimal superkey). The difference between 2NF and 3NF is that 2NF does not permit any non-prime attribute to have a dependency on something which is a proper subset of a candidate key, while in 3NF a non-prime attribute may not depend on anything which is not a superkey (2NF says "the whole key" while 3NF says "nothing but the key").

    So, can you please expand on this and clear up my confusion?

    Thanks,

    JimFive

    I hope the above is clear enough - if not, get back to me and I'll try again.

    Regards

    Tom

    Tom

  • roger.plowman (7/28/2011)


    Due to the extreme flexibility requirements of our reporting system users need at least a nodding acquaintence with the database schema--primarily how specific data can be found in specific tables, and how the tables are related to each other.

    Rather than try and explain anything about normal forms we use the SPOT and Path metaphor. SPOT = Single Point Of Truth (ie the data is only stored in one single place *once*) and there are always paths between tables, so for example a Zone (set of contiguous zip codes) is always connected to only one Office which in turn is in one Region.

    Of course the data has to be 3NF to support this, but I always thought SPOT and Path was a much more natural way to visualize 3NF anyway.

    The trouble with the Spot and Path approach is that there are some collections of business rules for which it doesn't work, because some data has to be stored in more than one place (but a combination of foreign key constraints, null constraints, and unique constraints ensure that any attempt to update one copy without updating the other fails). There's an example of this in Wikipedia, on the BCNF page, in the section "Achievability of BCNF". What's going on is that some data is involved in several business rules (which should become database constraints) which can't all be represented in one table.

    Where Spot and Path is applicable (which is very often) it seems to be a reasonable approach.

    Tom

  • Michael.Beeby (7/28/2011)


    NB: 40 Tree Rd became 30 Tree Rd at some point.

    Ouch! I will try to get that fixed. Thanks for pointing it out.

    edit (later): Steve has corrected it for me now - he really is good at fixing my mistakes. :blush::-)

    Tom

  • olsonea (7/28/2011)


    When I was studying Relational Database Concepts in college, there was a catchy mnemonic for 3NF: "The key, the whole key, and nothing but the key, so help me Codd."

    It's a good catch phrase. Been around a long long time.

    Did they tell you that:

    1NF = The key

    2NF = The whole key

    3NF = Nothing but the key

    ?

    Tom

  • The difference between 2NF and 3NF is that 2NF does not permit any non-prime attribute to have a dependency on something which is a proper subset of a candidate key, while in 3NF a non-prime attribute may not depend on anything which is not a superkey

    My confusion is that the original table in the example is, then, not in 2NF because the phone number depends on something (Group) that is not the whole key, or in fact, even part of the key. So it seems that you took a table from 1NF to 2NF which was then trivially in 3NF (in the sense that no extra work was required to get it into 3NF).

    What I would like to see is a table that is in 2NF converted into 3NF.

    --

    JimFive

  • James Goodwin (7/28/2011)


    The difference between 2NF and 3NF is that 2NF does not permit any non-prime attribute to have a dependency on something which is a proper subset of a candidate key, while in 3NF a non-prime attribute may not depend on anything which is not a superkey

    My confusion is that the original table in the example is, then, not in 2NF because the phone number depends on something (Group) that is not the whole key, or in fact, even part of the key. So it seems that you took a table from 1NF to 2NF which was then trivially in 3NF (in the sense that no extra work was required to get it into 3NF).

    What I would like to see is a table that is in 2NF converted into 3NF.

    --

    JimFive

    Ah, a confusion about 2NF. You hit the crucial point when you said "or [not] in fact, even part of [the key]", because that is of course allowed by 2NF. There's nothing in 2NF to stop an attribute being dependent on some set of attributes that includes something that isn't part of a key. The rule is that if a non-prime attribute is dependent on something that is a subset of a key that subset must be the whole key. Or putting it another way, the whole key is not the only thing it can be dependent on, the rule is simply that it must never be dependent on a partial key. So the original table inthe 3NF article is indeed in 2NF.

    Tom

  • Thanks for the articles Tom. I am really enjoying them; your examples are nice and clear. I am looking forward to the rest!

    Cheers,

    Nicole Bowman

    Nothing is forever.

  • Thanks Tom. I've enjoyed all 3 of these articles. I also like your examples.

    ---------------------------------------------------------------------
    Use Full Links:
    KB Article from Microsoft on how to ask a question on a Forum

Viewing 15 posts - 1 through 15 (of 22 total)

You must be logged in to reply to this topic. Login to reply