SQLServerCentral Article

Codd's Rules

,

These rules were formulated by E.F.Codd and published in 1985 1). They describe what a

relational database system must support in order to call itself relational.

So, without further introduction, let's dive into the gospel of relational databases!

1. Information Rule

Data is presented only in one way. As values in columns in rows. Simple, consistent and versatile.

A table (aka an entity or relation) is a logical grouping of related data in columns and rows.

Each row (aka record or tuple) represents a single fact and each row contains information about just one

fact. Each column (aka field or attribute) describes a single property of an object. Each value (datum) is

defined by the intersection of column and row.

2. Guaranteed Access Rule

Each value can be accessed by specifying table name, primary key and column name. This way each and

every value is uniquely identifiable and accessible.

Basically this is a restatement of the fundamental requirement for primary keys.

3. Systematic treatment of Null values

A fully relational database system must offer a systematic way to handle missing information.

Null is always treated as unknown. Null does mean no value or the absence of a value. Because no value

was entered, it follows that it is unknown. The information is missing. Null is not the same as an empty

string or 0. Each value, Null included, compared with Null, is Null.

4. Dynamic relational online catalogue

In addition to user defined data a relational database contains also data about itself. So there are

two kinds of tables.

  • user-defined
  • system-defined

Metadata is data which describe the structure of the database, its objects and how they are related.

This catalogue is integrant part of the database and can be queried by authorized users just like any other

table. Another name for this online catalogue is system catalogue or data dictionary.

5. Comprehensive Data Sublanguage Rule

Codd's intention was to have at least one language to communicate with the database. This language should

be capable to handle data definition, data manipulation, authorization, integrity constraints and transactional

processing. It can be used both interactively and embedded within applications.

Although SQL is not the only data query language, it is by far the most common one. SQL is a linear,

non-procedural or declarative language. It allows the user to state what he wants from the database,

without explicitly stating where to find the data or how to retrieve the data.

6. View Updating Rule

When presenting data to the user, a relational database should not be limited

to tables. Views are 'virtual tables' or abstractions of the source tables. They react like tables with

the one exception that they are dynamically created when the query is executed. Defining a view does

not duplicate data. They are current at runtime.

All theoretically updateable views should be updateable by the system. If data is changed in a view, it

should also be changed in the underlying table. Updateable views are not always possible. For example there is a

problem when a view addresses only that part of a table that includes no candidate key. This could mean that

updates could cause entity integrity violations. Some sources on the internet state that 'Codd himself did not

fully understand this'. I haven't found any rationale for this.

7. High-level Insert, Update and Delete

A relational database system must support basic relational algebraic operations (Selection, Projection and

Joins) as well as set operations like Union, Intersection, Division and Difference. Rows are treated like

sets for data manipulation. Set operations and relational algebra are used to create new relations by

operations on other tables.

8. Physical Data Independence

The physical layer of the architecture is mapped onto the logical. Users and applications do not depend

upon the physical structure of a database. Implementation of the physical layer is job of the storage engine

of a RDBMS. The relational engine communicates with the relational without any interaction by the user.

An application that queries data from a relational database does not need to know, how this data is physically

stored. It only sends the data request and leaves the rest to the RDBMS. Applications should not be

logically impaired when the physical storage or access methods change.

9. Logical Data Independence

Users and applications are to a certain degree independent of the logical structure of a database. The

logical structure can be modified without redeveloping the database and/or the application. The relations

between tables can change without affecting the functionality of applications or ad-hoc queries.

10. Integrity Independence

To be viewed as a relational database the RDBMS must implement data integrity as an internal part of the

database. This is not the job of the application. Data integrity enforces consistence and correctness of the

data in the database. Simply put, it keeps the garbage out of the database. Changes to integrity constraints

should not have an affect on applications. This simplifies applications, but is not always possible.

11. Distribution Independence

The data manipulation (sub)language of the RDBMS should be able to work with distributed databases.

Views should be able to join data from tables on different servers (distributed queries) as well as from

different RDBMS (heterogeneous queries). The user should not have to be aware of whether a database is

distributed or not.

12. Nonsubversion Rule

If the RDBMS supports a low-level (single record at a time) language, this low-level language should not

be used to bypass and/or subvert data integrity that are expressed in the high-level (multiple records at a

time) relational language.

0. Foundation Rule

Interestingly Codd defined a Rule 0 for relational database systems.

"For any system that is advertised as, or claimed to be, a relational database management system,

that system must be able to manage databases entirely through its relational capabilities, no

matter what additional capabilities the system may support." (Codd, 1990)

That means, no matter what additional features a relational database might support, in order to be

truly

called relational it must comply with the 12 rules.

Codd added this rule in 1990. Also he expanded this 12 rules to 18 to include rules on catalogs, data

types (domains), authorization and other. 2)

Codd himself had to admit the fact that based on the above rules there is no fully relational database

system available. This has not changed since 1990. To be more specific rules 6, 9, 10, 11 and 12 seem to be

difficult to satisfy.

REFERENCES:

1)Codd, E.F. "Is Your DBMS Really Relational?" and "Does Your DBMS Run By the

Rules?" ComputerWorld, October 14 1985 and October 21 1985.

2)Codd, E.F. The Relational Model for Database Management, Version 2;

Addison-Wesley; 1990.

Rate

4 (1)

You rated this post out of 5. Change rating

Share

Share

Rate

4 (1)

You rated this post out of 5. Change rating