Thank this author by sharing:
By Joe Celko,
What can we use in SQL instead of E. F. Codd's T theta operators for best-fit? Joe Celko returns with another puzzle that isn't new, in fact it already features “Swedish”, “Croatian” and “Colombian” solutions in chapter 17 of Joe's 'SQL for Smarties' book. These were all written before CTEs or the new WINDOW functions. Is there now a better solution? Was there one even then? We leave it to the readers to provide the answer!
In the book “The Relational Model, Version Two” (1990, ISBN13: 978-0201141924), Dr. E. F. Codd introduced a set of new theta operators, called T-operators, which were based on the idea of a best-fit or approximate equality. The name never caught on and nobody invented special syntax for them.
One problem is that they were defined by a procedural algorithm and not a set-oriented definition. It is easier to understand with an example modified from Dr. Codd.
The problem is to assign the classes to the available classrooms. We want (class_size < room_size) to be true after the assignments are made. This will allow us a few empty seats in each room for late students. We can do this in one of two ways. The first way is to sort the tables in ascending order by classroom size and the number of students in a class. We start with the following simplified tables:
CREATE TABLE Rooms
(room_nbr CHAR(2) NOT NULL PRIMARY KEY,
room_size INTEGER NOT NULL
CHECK (room_size > 0));
CREATE TABLE Classes
(class_nbr CHAR(2) NOT NULL PRIMARY KEY,
class_size INTEGER NOT NULL
CHECK (class_size > 0));
These tables have the following rows in them:
INSERT INTO Classes (class_nbr, class_size)
VALUES ('c1', 80),
INSERT INTO Rooms (room_nbr, room_size)
VALUES ('r1', 70),
The goal of the T-JOIN problem is to assign a class which is smaller than the classroom given it (class_size < room_size). Dr. Codd gives two approaches to the problem.
Sort both tables in ascending order. Reading from the top of the Rooms table, match each class with the first room that will fit.
Classes Roomsclass_nbr class_size room_nbr room_size==================== ==================='c6' 40 'r5' 30'c5' 50 'r2' 40'c4' 55 'r3' 50'c3' 65 'r7' 55'c2' 70 'r6' 65'c1' 80 'r1' 70
This gives us ...
Resultsclass_nbr class_size room_nbr room_size======================================='c2' 70 'r4' 85'c3' 65 'r1' 70'c4' 55 'r6' 65'c5' 50 'r7' 55'c6' 40 'r3' 50
Sort both tables in descending order. Reading from the top of the Classes table, match each class with the first room that will fit.
Classes Roomsclass_nbr class_size room_nbr room_size======================================='c1' 80 'r4' 85'c2' 70 'r1' 70'c3' 65 'r6' 65'c4' 55 'r7' 55'c5' 50 'r3' 50'c6' 40 'r2' 40
Resultsclass_nbr class_size room_nbr room_size======================================='c1' 80 'r4' 85'c3' 65 'r1' 70'c4' 55 'r6' 65'c5' 50 'r7' 55'c6' 40 'r3' 50
Notice that the answers are different! Dr. Codd has never given a definition in relational algebra of the T-Join, so I proposed that we need one. Informally, for each class, we want the smallest room that will hold it, while maintaining the T-JOIN condition. Or for each room, we want the largest class that will almost fill it, while maintaining the T-JOIN condition. These can be two different things, because these are not "best fit" solutions "first fit" approach.
Other theta conditions can be used in place of the "less than" shown here. If "less than or equal" is used, all the classes are assigned to a room in this case, but not in all cases. Nor is the solution always unique. Here is a (class_size <= room_size) answer for this data. Room 'r5' is too small to match any class.
Resultsclass_nbr class_size room_nbr room_size========================================'c1' 80 'r4' 85'c2' 70 'r1' 70'c3' 65 'r6' 65'c4' 55 'r7' 55'c5' 50 'r3' 50'c6' 40 'r2' 40
In the third edition of SQL FOR SMARTIES, I gave “Swedish”, “Croatian” and “Colombian” solutions, named for the countries of the submitters. They were all written without CTEs or the new WINDOW functions (that is the technical name for the OVER() clause on aggregate functions). Obviously, if two rooms or two classes are the same size, they can be swapped with each other. Let's make life easier by adding a UNIQUE constraint to room_size and class_size.
The problem is write an SQL statement that gives a T-Join.
WARNING: some of you already know this, but finding an optimal solution is an NP-complete problem. The bigger the data set gets, the effort it takes to solve the problem increase beyond any reasonable limit. You can Google for "Knapsack Problem" or "Bin Packing Problem" if you want the details.
The best answer to each stumper will be given a prize of a $100 Amazon voucher. The stumper will be run simultaneously on SQL Server Central and Simple-Talk. To see all the comments so far, you will need to visit both sites. We will take entries for a week after the first Monday of publication, posted as comments to the articles. Two weeks after the challenge is sent out, the judge's decision and comments will be sent out in the newsletter, and published on the site. Joe Celko and Phil Factor will judge the answers to this puzzle. Your answer should :Your answer should :
The first Pragmatic Works Foundation class concluded today and I think it was a great success! Each...
In today's world, technology workers need to point out solutions, not problems.
Andy Leonard (b | t | l | f) of Linchpin People, has announced a new training class called ...
SQL Server does an exceptional job at managing your data and making it available for your users and ...
Auto answer for popular problems
As a member of SQLServerCentral, you get free access to loads of fresh content: thousands
of articles and SQL scripts, a library of free eBooks, a weekly database news roundup,
a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals
that makes it such a success.