Codd's Rules

  • Frank Kalis

    SSC Guru

    Points: 111183

    Comments posted to this topic are about the content posted at http://www.sqlservercentral.com/columnists/fkalis/coddsrules.asp

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Razvan Socol

    SSCarpal Tunnel

    Points: 4758

    Very nice article. We should all know these rules, but more importantly the database vendors should keep them in mind whenever they build something.

    Here is a paper that discusses in more detail the Codd's twelve rules and how "the big three" DBMS-s are following them:

    http://www.handels.gu.se/epc/archive/00002948/01/Nr18_MIB.pdf

    All my respects to the late Mr. Codd; here are some things about his life and his work:

    http://www.bayarea.com/mld/mercurynews/news/5676110.htm

    Razvan

  • Frank Kalis

    SSC Guru

    Points: 111183

    quote:


    Very nice article. We should all know these rules, but more importantly the database vendors should keep them in mind whenever they build something.


    Thanks!

    And yes, let's hope that the marketing and sales guys can be tamed!

    quote:


    Here is a paper that discusses in more detail the Codd's twelve rules and how "the big three" DBMS-s are following them:

    http://www.handels.gu.se/epc/archive/00002948/01/Nr18_MIB.pdf


    Many thanks for this link.

    Just what I needed for some other purposes.

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de

    Edited by - Frank kalis on 12/10/2003 02:36:19 AM

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • bdevone

    SSC Enthusiast

    Points: 167

    Good article. However, I as a programmer have been asked to break rule #1, "Data is atomic. That means the intersection of a column and a row can only contain one single value," many times. At times I have been asked to put into one field the following values:

    1. Parameter strings

    2. xml

    3. Delimited lists

    4. Integers that are interpreted via bit-wise operators for multiple boolean values.

    Are there any justifications for these practices?

    Any comments?

  • Frank Kalis

    SSC Guru

    Points: 111183

    quote:


    1. Parameter strings

    2. xml

    3. Delimited lists

    4. Integers that are interpreted via bit-wise operators for multiple boolean values.


    When you follow this forum or other for a while you'll notice that it seems to be common practice. There are frequently post on how to pass arrays...

    Of course, you can store them in a single column. However, they will not be

    interpreted like you want.

    As usual, I suggest reading this http://www.algonet.se/~sommar/dynamic_sql.html

    quote:


    Are there any justifications for these practices?


    Honestly...???

    The vast majority of cases I have followed here were solved without using these practices.

    Well, one might argue that there are economic needs in that your time to market should be very short. So you might implement a design that quickly yields results.

    If I have the choice, I'd always would spent more time on proper database design.

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Adam Machanic

    SSCoach

    Points: 15259

    quote:


    Are there any justifications for these practices?


    Yes; laziness and lack of understanding of the relational model. Not that I'd suggest you tell this your boss or whoever is telling you to do this, especially in the current job market... But try to push them in the right direction and keep it in mind for the future when you're architecting databases and telling others what to put in them. And always think about the "what if" scenarios! Remember that in a year, "we won't ever need to query this data" will invariably become, "we need a report based on the fifth member of that delimited list, for all of our 10,000,000 customers, and it has to run in under a second."

    Check out this site... lots of great content on this and related subjects:

    http://www.dbdebunk.com

    --
    Adam Machanic
    whoisactive

  • Jonathan

    SSC-Insane

    Points: 20427

    Good article, Frank.

    But I think you're mistakenly mixing data atomicity into Codd's Information Rule. I didn't check your sources, but decomposition is a function of schema design (first normal form) and not something which can or should be enforced by a DBMS.

    That's not to say that a DBMS can be designed to prevent an unlearned schema designer from breaking the Information Rule. I've seen several designs on these forums that unfortunately do just that by having table names convey data, e.g. otherwise identical tables named CashInvoice and CreditInvoice. This, IMHO, breaks Codd's Information Rule by storing information in the identifier rather than as values in columns, and thereby leads to kludges like UNIONs and dynamic T-SQL.

    An RDBMS can be designed to support one of Codd's rules, but it doesn't necessarily prevent one from ignorantly breaking that rule; e.g. one can also create a table without a primary key in SQL Server, so the database is therefore not relational, but that cannot be blamed on SQL Server.

    --Jonathan



    --Jonathan

  • ckempste

    SSCoach

    Points: 17983

    Hi there

    Ok article, not sure what I got from it though 🙂 Codd and co had his day, and the pioneer work done in and around this era was revolutionary, but dont see how this improves/build/differentiates me into the future, or perhaps even, provides an interest outside of Uni..

    Just thinking aloud.

    Cheers

    Ck

    Chris Kempster

    http://www.chriskempster.com

    Author of "SQL Server 2k for the Oracle DBA"


    Chris Kempster
    www.chriskempster.com
    Author of "SQL Server Backup, Recovery & Troubleshooting"
    Author of "SQL Server 2k for the Oracle DBA"

  • Frank Kalis

    SSC Guru

    Points: 111183

    quote:


    But I think you're mistakenly mixing data atomicity into Codd's Information Rule. I didn't check your sources, but decomposition is a function of schema design (first normal form) and not something which can or should be enforced by a DBMS.


    Yes, you're right.

    It is not mentioned in Rule #1. Looking to fix the article.

    quote:


    An RDBMS can be designed to support one of Codd's rules, but it doesn't necessarily prevent one from ignorantly breaking that rule; e.g. one can also create a table without a primary key in SQL Server, so the database is therefore not relational, but that cannot be blamed on SQL Server.


    How do you say?

    'Only a poor worksman blames his tool'?

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Frank Kalis

    SSC Guru

    Points: 111183

    quote:


    Ok article, not sure what I got from it though 🙂 Codd and co had his day, and the pioneer work done in and around this era was revolutionary, but dont see how this improves/build/differentiates me into the future, or perhaps even, provides an interest outside of Uni..


    Well, okay this might not be an article for a seasoned db developer or dba like you, but there are also not so experienced people.

    Surely I get flamed for this, but I can understand why Celko rants on the MS Newsgroups. And in most cases he is right on what he says.

    Note, I didn't say I like the way he says his things (although I find them very entertaining).

    Personally I blame it on that 'Visual..' thing that makes it easy to produce results in a very short time. Even if you have very less experience. And I think this is what happens quite often nowadays. There are many people around who develop suboptimal databases and/or software. Why?

    One reason IMHO is that the don't know and don't care about the fundamental principles of this 'art'.

    So there should be interest in that theory even outside the mensa crowd.

    Don't take me wrong, Chris, but I think it doesn't hurt anyone to know a little bit more than one needs to accomplish his tasks.

    I for myself like looking over the rim of my coffee cup.

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • ckempste

    SSCoach

    Points: 17983

    Hi Frank

    I hear you loud and clear 🙂

    Thanks for the article btw, I actually did enjoy it along with your others... I have a real admiration of Codd et al and true, there is a fair number of rules missed by designers and dba's alike that have been long since proven but often ignored.

    Best Regards

    Ck

    Chris Kempster

    http://www.chriskempster.com

    Author of "SQL Server 2k for the Oracle DBA"


    Chris Kempster
    www.chriskempster.com
    Author of "SQL Server Backup, Recovery & Troubleshooting"
    Author of "SQL Server 2k for the Oracle DBA"

  • Frank Kalis

    SSC Guru

    Points: 111183

    Hi Chris,

    glad to see, you didn't get me wrong !!!!

    Let me give an realworld example.

    At the beginning of last year my company decided we need a system to manage our mutual funds, because our unit-linked products grow both in size and importance.

    Now, because I live in both worlds, I said ok, I will do this.

    Our management was concerned what might happen when I leave the company, who will be there to care for the system, who will know what the system does....

    So they decided to give this to a third party software company. The requirements were SQL Server as DBMS and VB6 as programming language (I wouldn't have done anything else, btw).

    Then, the programmer arrived and stayed two weeks with us to get the idea what we want and what functionality this system needs to have.

    Now, this programmer was

    - not familiar with SQL Server (first project with this DBMS)

    - not familiar with VB (first project with VB)

    - had no clue about overall treatment of assets like book-keeping and alike

    I had to do that much overtime to explain our requirements to him, that I would have been far better off doing this by myself.

    Now one and a half year later, the system is in production and it is close to collapse. There is that much garbage in that db, the general responding times are unacceptable and overall performance is poor.

    Management realized this and now guess whom they ask to look at bugs???

    After all, this funny exercise was at a cost of some 60k Euros.

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • Clay_G

    Default port

    Points: 1454

    It is good to be reminded just what RELATIONAL means.

    XML is an example. Your piece of XML will itself have columns and rows, so it is logically a table (relation). By wrapping it up as a string and storing it in a column, you are logically nesting tables.

    You database engine may allow you to step outside the bounds of relational principles this way, but the price you pay is that nicely interchangable tools that have developed due to the consistency of the relational model, like report writers for example, will not work with such arbitrary models. Not that XML itself cannot be recognised as relational data, its where its usage breaks the rules, such as nesting tables.

    XML in particular is better thought of as a TRANSPORT for relational data, and as such you want translators to/from XML and native tabular protocols. I think SQLServer got it wrong. Instead of SELECT..FOR XML, there should have been an alternative type of CONNECTION that used XML as the transport, and every SQL statement should be independent of which type of transport it will use. In ADO, XML would be a subclass of Recordset. I hate to think that all the stored procedures that existed on one database I worked on, would all need to be cloned for an XML version.

  • Frank Kalis

    SSC Guru

    Points: 111183

    I totally agree. It's good to remember and even better to learn what RELATIONAL means.

    Just a while ago we had some interesting discussions on pros and cons of XML.

    I'm not going to argue here on XML.

    Frank

    http://www.insidesql.de

    http://www.familienzirkus.de

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/[/url]

  • TomThomson

    SSC Guru

    Points: 104773

    Frank Kalis, article


    Updateable views are not always possible. For example there is a problem when a view addresses only that part of a table that includes no candidate key. This could mean that updates could cause entity integrity violations. Some sources on the internet state that 'Codd himself did not fully understand this'. I haven't found any rationale for this.

    I know it's more than 7 years on, but the article raises a particular question/issue that has not been resolved in the comments so far so thios comment, although very late, may still be useful.

    I think that the thing that people (including Codd himself) didn't understand (in 1985: he did understand it later) in respect of updateable views wasn't that not all views were updateable (he certainly understood that, else he would not have talked about theoretically updateable views) but that there is no effectively computable algorithm which is guaranteed, given a view definition and the definitions of the underlying tables (assumed to be in 5NF, using NF definitions which allow non-key columns to be nullable - as did all the original definitions), to terminate after finite time and indicate whether the view is theoretically updateable or not. He mentions this in the RM 2 book you reference (the solution was to have some algorithms that were pessimistic - sometimes they would say something was not updateable when it was, but they were guaranteed always to terminate) and chage the updateable view requirement to say only that views updateable according to whatever algorithm the system used would be updateable instead of saying that all theoretically updateable views would be updateable - the trade-off here was that this slight loss of generality for view updateableness made the rule possible to implement (or another way of looking at it was that he introduced a meta-rule that all the rules had to be effectively computable, and this forced a change in the view update rule).

    The lack of an effectively computable algorithm to decide view updateability was not discovered until after the two 1985 articles had been published, so Codd's failure to recognise that issue is not in any way remarkable or surprising, nor does it indicate any kind of careless or failure of competence on Codd's part. He did of course recognise the issue with his updatability rule as soon as this lack was discovered and he was informed of it.

    Tom

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply