"Real" copy of Codd's 12 (13) Rules for RDBMS

  • I've been cruising the web on the subject of Codd's 12 rules for a long time now. I've read what amounts to about 50 articles and hundreds of posts on the subject. If you compare the rules from all those, you find large disparities in what is said, so I don't trust any of them to be correct. Instead of sifting through all the manure to find out what the horse was thinking, I just want to ask the horse. 😉 Unfortunately, Mr. Codd is no longer in the land of the living (may his soul rest in peace).

    The only link that I've found that may have had an actual copy of Mr. Codd's original white paper, which contained the 12 rules, is broken.

    With all of that in mind, does anyone have a link or a copy of the actual white paper or, perhaps a book by Codd where he mentions the rules? I'd really like to "hear" the rules from the horse's mouth because, from what I can see, too many people have mis-copied or misinterpreted the rules.

    Thanks for the help, folks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • It is included in the disc version of the ACM SIGMOD anthology, although I don't think the text is available online.

    http://dblp.uni-trier.de/db/conf/icde/Codd86.html

    That's the republished version of the Computerworld article.

    The 13 "rules" are still useful discussion points for students I suppose. I'm not sure how important it is to be precise about the original text however. They are definitely not a formal definition of the relational model. Rather they are some potentially interesting observations about some properties of DBMSs. The article is also strong in its criticisms of SQL and the SQL standard's failure to support the relational model. Yet the rules are frequently quoted in a SQL context while Codd's remarks opposing SQL are never mentioned!

    The science and understanding of the relational model has developed since 1985 and there are other texts that I think cover the same material better. Read Codd by all means, but also read David Maier, Date, Darwen and the Alice Book (Abiteboul).

  • I asked a friend, who is a CS Ph.D student, if he could dig it up. Unfortunately, no luck. His reply:

    It's referenced in the ACM digital library:

    http://portal.acm.org/citation.cfm?id=655234

    But being a paper from 1986 there is no pdf of it there (not even a scan of a hard copy as is the go with older papers).

    Oddly, there is no reference to it at all in the ieee digital library.

    As a paper from 1986, I'm not sure what the chances are of it being in any library. It certainly isn't in the [Swinburne University] one...

    So I'm all outta ideas.

  • Heh... thanks for the response, David, but it's actually not up to you whether I think or will think the 13 rules are "a formal definition" or not, especially when so many people cite them to explain why SQL Server (SQL in general, really) is not truly an RDBMS. And, yes, I believe it to be very important to explore the original text, just as it might be if one were exploring the "Agile Manifesto", and for the same reasons... too many people have put their own slant on things and I'd love to have the opportunity to judge for myself. The only way I can do that is to study the original wording of the original document (or a true facsimile of it). And, yeah... I'd like to study the whole document and not just the 13 rules.

    I appreciate your suggested reading list but those folks aren't Codd. As I said, a lot of folks use Codd's rules and his supposed quotes, and I'd like to see just how accurate those people are as well as having the opportunity to make up my own mind about what Codd actually stated.

    {edit} Crud... the site you referenced wants to download to a CD and I won't have access to a computer that has a burnable CD or DVD for at least another week. I hope someone else can come up with a facsimile of the original document but, if they don't, I'll do the download next week. Thanks for the leg up, David.

    --Jeff Moden



  • Jim,

    Thank you and your friend both. I appreciate the (literally) good ol' college try. 🙂

    --Jeff Moden



  • Jeff Moden (7/12/2010)


    Crud... the site you referenced wants to download to a CD and I won't have access to a computer that has a burnable CD or DVD for at least another week. I hope someone else can come up with a facsimile of the original document but, if they don't, I'll do the download next week. Thanks for the leg up, David.

    No, it's actually worse than that: the CD (or DVD) is a local file reference - in effect it wants you to insert the CD in your drive so that it can read the data from there! You may still be able to buy a copy of those conference proceedings from the IEEE, but it may no longer be available and anyway if it is available it will probably cost an arm and a leg; I'm pretty sure that several university libraries in the US will have a copy, but more won't and some are not at all helpful about access anyway. Besides, as David pointed out, the ICDE paper was not the original version.

    However, all is not lost: there is an accurate (as far as I can remember :crying: - must be getting senile) version on the web, at http://www.cse.ohio-state.edu/~sgomori/570/coddsrules.html. (I have no idea why Gomori teaches it in his File Structure and Design course instead of in his Intro to Database Systems! I don't know the guy, so I haven't asked him.)

    (edit) But watch out for rule 6: there are some problems, which Codd addressed later.

    Tom

  • Thanks, Tom. Yep... this is one of the many versions I've seen. Obviously I was hoping for the real thing but if someone like you, Barry Young, or any of the other folks that can fluently speak "Codd" endorses a particular version as the "real deal", then I'll have to go with that. Thanks, Tom.

    --Jeff Moden



  • Almost forgot... thanks for the warning on Rule 6, Tom. I haven't seen anyone address the "problem" you're talking about but I suspect that the word "View" in that case is not an object in the database... correct?

    --Jeff Moden



  • Jeff Moden (7/13/2010)


    Almost forgot... thanks for the warning on Rule 6, Tom. I haven't seen anyone address the "problem" you're talking about but I suspect that the word "View" in that case is not an object in the database... correct?

    No, a view is in the database; but there is a result for relational algebra and first order logic handling view definition and update that's a bit like the "halting problem" for Turing Machines or Gödel's "Incompleteness Theorem" for the predicate calculus. http://www.informatik.uni-trier.de/~ley/db/journals/sigmod/Buff88.html and http://www.informatik.uni-trier.de/~ley/db/journals/sigmod/Codd89.html give references to some documents on this. As I understand it the original rule 6 is a bit like saying "your programming language has to be capable of expressing all Turing Machines that halt and no others", but I'm very rusty on this stuff (the last time I worked on the semantics of logical calculi was early 1968) so that may not be exactly what the problem is.

    Codd revised his rules (and revised rule 6 extensively) and published the new versions in his book "The relational model for database management: version 2" in 1990; this can be bought new (233.45 dollars) or second hand (40 USD for good quality) via Amazon, or you can create an ACM Web Account and buy a PDF download of it for $15. Chapter 17 View Updatability is about 30 pages (the whole book is 567 pages). There's a rules index as well as other indices at the end. The book's probably a better source for Codd's rules than anything else currently available - it's not the original version, but it is Codd's own words, just 5 years later than the original version.

    Tom

  • Here is rule 1 as it appears in Codd's 1990 version:-

    RS-1 The Information Feature

    See Rule 1 in the 1985 set. The DBMS requires that all database information seen by application programmers (AP) and interactive users at terminals (TU) is cast explicitly in terms of values in relations, and in no other way in the base relations. Exactly one additional way is permitted in derived relations, namely, ordering by values within the relation (sometimes referred to as inessential ordering).

    ------------------------------------

    This means, for example, that, in the database, users see no repeating groups, no pointers, no record identifiers (other than the declared primary keys), no essential (i.e., non-redundant) ordering, and no access paths. Obviously this is not a complete list. Such objects may, however, be supported for performance reasons under the covers because they are then not visible to users, and hence impose no productivity-reducing burden on them.

    -------------------------------------

    Codd follows this with a long explanation of why repeating groups are a bad thing.
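
    As a rough illustration of the point about repeating groups (not anything taken from Codd's text), here is a minimal sketch using Python's built-in sqlite3 module. The table names `person` and `person_phone`, the column names, and the sample data are all made up for the example. Each phone number is an ordinary value in a row rather than a `phone1`/`phone2`/`phone3` group, the only link between the two tables is the `person_id` value, and the ORDER BY on the derived result is the "inessential ordering" the feature allows.

```python
import sqlite3

# Hypothetical schema: names and data are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    -- One row per phone number instead of a repeating group of phone columns.
    CREATE TABLE person_phone (
        person_id INTEGER NOT NULL REFERENCES person(person_id),
        phone     TEXT    NOT NULL,
        PRIMARY KEY (person_id, phone)
    );
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO person_phone VALUES
        (1, '555-0100'), (1, '555-0101'), (2, '555-0199');
""")

# The join is expressed purely through values; ordering is applied only to
# the derived relation and carries no information of its own.
rows = conn.execute("""
    SELECT p.name, ph.phone
    FROM person AS p
    JOIN person_phone AS ph ON ph.person_id = p.person_id
    ORDER BY p.name, ph.phone
""").fetchall()
print(rows)
```

    Any index SQLite builds behind the primary keys is an "under the covers" access path in the sense of the text above: it may speed things up, but the query never has to mention it.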

    I'll post the other rules from the 1990 book as and when I find time.

    Tom

  • Codd's 1985 rule 6 (view updatability) is, according to his rules index at the back of the 1990 book, replaced by RV-4 and RV-5; but according to the main text RV-6 is the modified rule 6. So all three are given below.

    They refer to Algorithm VU-1, an algorithm that tries to determine from a view definition whether the view is updatable but can deliver a "don't know" answer as well as "yes" and "no".

    --------------------------------------------------------------------------------

    RV-4 Retrieval Using Views

    Neither the DBMS nor its principal relational language, RL, makes any user-visible distinctions between base R-tables and views with respect to retrieval operations. Moreover, any query can be used to define a view by simply prefixing the query with a phrase such as CREATE VIEW.

    ---------------------------------------------------------------------------------

    RV-5 Manipulation Using Views

    Neither the DBMS nor its principal relational language, RL, makes any user-visible manipulative distinctions between base R-tables and views, except that (1) some views cannot accept row insertions, and/or row deletions, and/or updates acting on certain columns (Algorithm VU-1 or some stronger algorithm fails to support such action), and (2) some views do not have primary keys and therefore will not accept those manipulative operators that require primary keys to exist in their operands.

    -----------------------------------------------------------------------------------

    RV-6 View Updating

    To evaluate the updatability of views at view-definition time, the DBMS includes an implementation of Algorithm VU-1 or some stronger algorithm. Neither the DBMS nor its principal relational language, RL, makes any user-visible manipulative distinctions between base relations and views, except that:

    some views cannot accept row insertions, and/or row deletions, and/or updates acting on certain columns because Algorithm VU-1 or some stronger algorithm fails to support such action;

    and

    some views do not have primary keys (they have weak identifiers only) and therefore will not accept those manipulative operators that require primary keys to exist in their operands.

    (This feature is a slightly modified version of Rule 6 in the 1985 set.)

    -----------------------------------------------------------------------------------

    I think that in his RV-6 Codd understated the extent of the modification from the old rule 6. Further on in the book he states:

    Unfortunately, the general problem of determining whether or not a view is theoretically updatable cannot be decided logically [Buff 1986].

    and that's why he had to change the rule.
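
    To make RV-4 and the "some views cannot accept row insertions" exception concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not an implementation of Algorithm VU-1, and the `employee` table, the `sales_staff` view and the data are invented for the example. SQLite is used only because it is handy; it takes a much coarser line than VU-1 and treats every view as read-only unless INSTEAD OF triggers are defined on it.

```python
import sqlite3

# Hypothetical table, view and data; all names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        dept   TEXT NOT NULL
    );
    INSERT INTO employee VALUES (1, 'Ann', 'Sales'), (2, 'Bob', 'IT');
""")

# RV-4: any query becomes a view by prefixing it with CREATE VIEW ...
query = "SELECT emp_id, name FROM employee WHERE dept = 'Sales'"
conn.execute("CREATE VIEW sales_staff AS " + query)

# ... and retrieval from the view looks exactly like retrieval from a table.
from_query = conn.execute(query).fetchall()
from_view = conn.execute("SELECT emp_id, name FROM sales_staff").fetchall()

# Manipulation is where the distinction shows up: SQLite rejects the insert
# because sales_staff is a view with no INSTEAD OF trigger.
try:
    conn.execute("INSERT INTO sales_staff (emp_id, name) VALUES (3, 'Cy')")
    view_accepts_insert = True
except sqlite3.OperationalError as exc:
    view_accepts_insert = False
    print("insert through view rejected:", exc)
```

    Note that this particular insert could not be honoured sensibly anyway: the view hides `dept`, which is NOT NULL in the base table, so there is no value to store for it.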

    Tom

  • Awesome feedback. As a side bar... Heh... no apologies for wanting to form an opinion on some things on my own (which I still have to do), but it does sound like he's correct about the original rules not being so important anymore.

    Thank you both.

    I'd heard the book that Tom recommended wasn't so hot but it's one of the few things I can actually find that was written by Codd instead of being paraphrased. I'll see if I can dig up one of those cheap copies Tom was talking about.

    Thanks again, guys.

    --Jeff Moden



  • Jeff Moden (7/13/2010)


    I'd heard the book that Tom recommended wasn't so hot but it's one of the few things I can actually find that was written by Codd instead of being paraphrased. I'll see if I can dig up one of those cheap copies Tom was talking about.

    If you get it as PDF from ACM using the beta version of the portal you may be given a non-working download URL: if that happens don't panic, just try eliminating the folder names after the website root (the FT cfm is in the root) - but they've probably fixed that bug now. Or if you don't want to risk it, don't follow the link to the Beta version. It's a text PDF (either generated using OCR or converted from a postscript version, I guess) not a scanned image file so text-searchable, unlike many (undesirable) book PDFs.

    You are right, it isn't a hot book - it's too long, nothing that long can be hot. It's many times as long as the original paper describing RM and many times as long as the RM-T paper. Codd's justification for the length was that he wanted to put together in one place all the contents of 50+ articles which he had written on the topic so that they would be available to the people working on RDBMS (well, we know how hard it is to get for example the original two "rules" papers, or the sigmod presentation of them the following year, so we know he had a point there). Despite its length it's good - but it would have been better if it had been published a few years earlier, when it would have had some chance of influencing the mainstream development of RDBMS and SQL, instead of in 1990 when all the big players already had massive investment in doing it wrong. Something like this book - in shorter form - might have changed the course of RDBMS development if it, rather than the two rules papers, had been published back in 1985 instead of in 1990.

    Tom

  • Amazing documentation. Thanks a lot, Tom.

    I'm saving these jewels in my private library for future reference 🙂

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.
  • Another rule (number 2) in its 1990 version. This one hadn't changed much from the 1985 version.

    RM-1 Guaranteed Access

    Each and every datum (atomic value) stored in a relational database is guaranteed to be logically accessible by resorting to a combination of R-table name, primary-key value, and column name. (This feature is Rule 2 in the 1985 set.)

    _______________________________________________________

    He follows this definition of the rule with some explanatory text:-

    The access path supporting this feature cannot be canceled. Most other access paths, however, are purely performance-oriented, and can be both introduced and canceled. Both Feature RM-1 and Feature RM-3 are needed in order to support ad hoc query without pre-defined access paths.

    Clearly, each datum in a relational database can be accessed in a rich variety (possibly thousands) of logically distinct ways. It is important, however, to have at least one means of access, independent of the specific relational database, that is guaranteed - because most computer-oriented concepts (such as scanning successive addresses) have been deliberately omitted from the relational model.

    Note that the guaranteed-access feature represents an associative-addressing scheme that is unique to the relational model. It does not depend at all on the usual computer-oriented addressing. Moreover, like the original relational model, it does not require any associative-addressing hardware, even though the need for such hardware was once frequently claimed by opponents of the relational model.

    The primary-key concept, however, is an essential part of Feature RM-1. Feature RS-8 requires each base relation to have a declared primary key (see Chapter 2). Feature RM-1 is one more reason why the primary key of each base relation should be supported by every relational DBMS, and why its declaration by the DBA should be mandatory for every base relation.

    _____________________________________________________

    (Feature RM-3 is support of the full power of a 4-valued predicate calculus; obviously you can't usefully do ad hoc queries without some sort of logic support, so the reference here isn't anything odd.)
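
    The addressing scheme in RM-1 (R-table name, primary-key value, column name) can be sketched in a few lines with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: `fetch_datum` is a hypothetical helper, the `part` table and its data are invented, and the primary-key column name is passed in explicitly for simplicity, whereas a real DBMS would look it up in its catalog.

```python
import sqlite3


def fetch_datum(conn, table, pk_column, pk_value, column):
    """Return the one value addressed by table name, primary-key value and
    column name, or None if no row has that key. Hypothetical helper."""

    def quote(ident):
        # Identifiers cannot be bound as parameters, so quote them by hand.
        return '"' + ident.replace('"', '""') + '"'

    sql = (
        f"SELECT {quote(column)} FROM {quote(table)} "
        f"WHERE {quote(pk_column)} = ?"
    )
    row = conn.execute(sql, (pk_value,)).fetchone()
    return None if row is None else row[0]


# Invented sample data for the usage example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (
        part_no INTEGER PRIMARY KEY,
        descr   TEXT NOT NULL,
        qty     INTEGER NOT NULL
    );
    INSERT INTO part VALUES (42, 'widget', 7), (43, 'gadget', 0);
""")

value = fetch_datum(conn, "part", "part_no", 42, "qty")
missing = fetch_datum(conn, "part", "part_no", 999, "qty")
print(value, missing)
```

    Nothing in the call refers to a storage address, a row position or an index; the three logical names are enough to reach the datum, which is the point of the feature.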

    It's interesting that Codd says that a need for associative-addressing hardware was claimed to exist by opponents of the relational model. Associative addressing disc hardware was originally proposed in the 1960s, several years before Codd proposed the relational model, and the justification for it was nothing to do with the relational model: it was that no-one had any CPUs (or reasonably priced arrays of CPUs) that could handle search at the data rates that could be provided by "modern" (1960s-vintage) discs.

    The first customer deliveries of Content Addressable FileStore (CAFS) occurred before any relational database was delivered, and were made to BT and several other ICL customers in the 1970s; the software transformed queries into instructions for search logic incorporated into the drive mechanism. This early version was built as a series of one-offs for each customer that wanted one, not as a standard product - the standard product came later when controller technology had improved sufficiently to permit the search logic (which was a bit advanced for its time - for example it included weighted quorum logic) to be moved from the drive mechanism to the disc controller. CAFS was included in all ICL Series 39 mainframes and most 2900 series mainframes from 1982 onwards. At this point it was used mainly for IDMS (which is certainly not relational) and for IDMS-X, but later it was incorporated into the ICL port of Ingres to the VME operating system and it was also used for searching databases (again not relational) of unstructured text (e.g. at Oxford and Southampton Universities). Another version (SCAFS for "Son of CAFS": ICL in those days had no qualms about product naming standards) was based on firmware in a standard microprocessor.

    As well as IDMS-X, all of Informix, Oracle and Unix provided an interface for the incorporation of/interfacing to the CAFS support software (a standard library provided by ICL with the hardware) and IBM licensed the SCAFS technology from ICL for use with DB2; but I don't recall anyone claiming it was essential for relational database support. Certainly no-one in ICL made or believed any such claims - CAFS and SCAFS were about getting past the then existing hardware limitations on search capability (effectively limitations on scan, filter, and join speeds) and nothing to do with what the database model was - and as processor speeds improved and the cost of multi-processor systems came down associative-addressing disc hardware would cease to be commercially viable (if the performance estimates produced by Leung and Choo http://www.vldb.org/conf/1985/P282.PDF were correct, a relative improvement factor of 2.5 between processor and disc bang per buck would do that; I don't think any modern systems use it - it would be fairly pointless if the CAFS hardware used the same industry standard components as the server).

    As far as I know no-one but ICL (and IBM under license from ICL) ever produced any associatively addressed disc systems as a large-volume product. I guess that IBM (specifically the DB2 team) must be where the claims that associative addressing hardware was necessary for support of the relational model came from - after all, they were the opponents of RM best known to Codd.

    Tom
