RE: Level 4: The Mathematics of SQL: Part 1

SSC Guru

Points: 104773

September 6, 2011 at 6:35 am

Although I don't much like the article (I was a mathematician before I got involved in computers, and a computer scientist before I was an engineer/developer/architext, so I prefer people to avoid misusing/making incorrect statements about well understood mathematical concepts like sets; if the author had used "bag" instead of "set" it would have been fine, because in maths bags do allow duplicates) and agree with David P that it's not really a great start, I think his other remarks are over the top (misuse of one term is not a total disaster) because the diagrams and some of the other stuff probably will help some people.

It always amuses me when I see the claim that relational algebra is nothing to do with tables (do you know a better graphical representation of a relation, David? I suppose there may be such a thing. And Jeff's pork chop launcher might be made obsolete by the discovery that flying pigs with detachable parts are abundant in Minnesota and can be trained to divebomb people who do things row by row).

As for a tables without keys, with duplicates, and allowing nulls not being a set at all that's certainly true. However, a relation (in the sense of the relational model for data) is not a set either (it can't be: it has a key, which is not anything a set has), so why complain that a table is not a set? And I see the usual unreasoned anti-null nonsense rearig its head there: how on earth could allowing null affect whether something could be a set.

Anyway, I don't believe that getting a mathematically correct definition of of a relation (in the relational data model sense) will be of much use to anyone, not even to a mathematician or a theoretical computer scientist (although I've done it several times in different forms, just for fun). How useful is it to the average DBA or DB developer to know that in the relational data mathematical world we can safely go at defining a relation by something like the following:

edit (some time later): edit: now that David declined the error detect challenge, I can say that I intended to have one; but I can't convince myself that it worked - it looks as if there isn't one. careless of me. it just shows how silly a mathematician can get when playing at mathematics this abstract. Reminds me of Martin Gardner's statement that "Pure mathematics is very like the grin without the cat".

[/i]

Let S be a finite map from names (text strings) to datadomains,

O1 be a total order on S,

K be a submap of S,

O2 be a total order on K, and

[D] be the tuple <S, O1, K, O2>.

Then the relation type RT with definition D is the tuple <D, P> where

P is a totally recursive function on positive integers to ordered maps of type S with order O1 such that for all integers n the cardinality of the map set P(n) restricted to K is the same as the cardinality of P(n), and there is no proper submap of K for which this holds true, and no superset of P for which this holds true.

A relation R of type RT is a tuple <RT, V> where V is finite map from time to positive integers.

At time t the value of R at time t is the tuple <D, P(V(t))>.

Of course that only works when each relation has exactly one key, and all the business rules to be enforced by the keys are enforced ny the primary key. It gets more complicated when there are additional candidate keys.

I don't think anything like that is the slightest bit useful, unless one wants to get into machine driven theorem proving in which case this sort of definition (in far more complex form, to cover everything) might conceivably turn out to be useful. (Incidentally, David, can you tell wether I put a deliberate mistake in that definition or not? If not, why do you complain about someone talking tables instead of mathematically correct definitions?)

Tom