This is a really good article, because people are not aware of the problems with statistics. The classic book on this is "How to Lie with Statistics" by Darrell Huff. It is old, but it has stayed in print since the 1950s. Another good book is Stephen K. Campbell's "Flaws and Fallacies in Statistical Thinking", which is along the same lines. A quick introduction to the computational side is "How to Think About Statistics (Revised Edition)" by John L. Phillips Jr.
My only complaint with the article is that you did not post a table. A table by definition has to have a key, and a properly designed table should not have redundancies in it. A well-designed table would follow ISO 11179 naming rules, industry standards, and other things. Do you really need 50-character-long strings? Why do you believe there is a generic "id" in RDBMS? Why are you using floating-point for money (that is actually illegal)? Why do you have both age and age cohort in the same table? We do not store computed data, but if you wanted to do it, then you would need a CHECK constraint to make sure that the two values are actually redundant instead of conflicting. Which ethnicity code did you use? Etc. I know you just dashed this out in a hurry, but it bothers me to see such bad SQL coding.
CREATE TABLE Paradox_Data
(sample_id CHAR(5) NOT NULL PRIMARY KEY,
 birth_date DATE NOT NULL,
 sex_code CHAR(1) NOT NULL
   CHECK (sex_code IN ('0', '1', '2', '9')), -- values match ISO/IEC 5218
 expenditure_amt DECIMAL(12,2) NOT NULL,
 ethnicity_code CHAR(5) NOT NULL);
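If someone did insist on storing both an age and its cohort, the CHECK constraint mentioned above might look roughly like the following. This is only a sketch: the table name, the age_years and age_cohort columns, the constraint name, and the cohort boundaries are all hypothetical, since the article's actual cohort definitions are not given here. Note that age is itself computed from birth_date, so the table above avoids the whole problem by storing only the birth date.

```sql
-- Hypothetical variant that stores both an age and its derived cohort.
-- Column names and cohort boundaries are assumptions for illustration only.
CREATE TABLE Paradox_Data_With_Cohort
(sample_id CHAR(5) NOT NULL PRIMARY KEY,
 age_years SMALLINT NOT NULL
   CHECK (age_years BETWEEN 0 AND 120),
 age_cohort CHAR(5) NOT NULL
   CHECK (age_cohort IN ('00-17', '18-64', '65-up')),
 -- The stored cohort has to agree with the stored age,
 -- so the two columns can be redundant but never conflicting.
 CONSTRAINT cohort_matches_age
   CHECK ((age_cohort = '00-17' AND age_years BETWEEN 0 AND 17)
       OR (age_cohort = '18-64' AND age_years BETWEEN 18 AND 64)
       OR (age_cohort = '65-up' AND age_years >= 65)));

-- This row should be rejected, because 42 does not belong to the '00-17' cohort.
INSERT INTO Paradox_Data_With_Cohort (sample_id, age_years, age_cohort)
VALUES ('A0001', 42, '00-17');
```

The constraint costs a little typing, but it moves the consistency rule into the schema, where every program that touches the table has to obey it.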
Again, a good article, and it might be worth doing a follow-up on some other statistical paradoxes.
Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL