Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Celko's SQL Stumper: The Data Warehouse Problem

By Joe Celko,

Joe Celko comes back with a puzzle that isn't new, but one where the answer he originally gave now seems archaic: It is a deceptively simple problem, but is it true that the new features of SQL have simplified the solution? We leave it to the readers to provide the answer!

Oldies but Not So Goodies

There was a posting a SQL Server Newsgroup a short while ago by a regular who was proud that his old T-SQL was still running today.  He saw it as a tribute to MS providing upward compatibility in their 4GL dialect and his skill.  Of course, there were problems when *= was deprecated, BIT became a numeric type and a few other things happened to the SQL engine as it moved from dialect to something closer to Standard SQL. 

I appreciate the sentiment, but I would consider the fact of my old T-SQL still running as a problem rather than a brag.  In those days, we used triggers where we now have DRI actions.  We wrote a lot of IF-THEN-ELSE logic where we now have CASE expressions.  We nested subqueries where we now have an OVER() clause.  We truncated DATETIME to zero hour where we now have a native DATE data type. The ISO-11179 naming conventions did not exist. 

I cannot go back and modernize my old procedures.  The companies that were failures or got bought up are not around.  The companies that were successful have migrated code already – which is part of why they are successful. 

But I do have my old puzzle columns to haunt me.  The clever code of a decade or more ago might still work today, but it ought to be replaced with a current release and rewritten with the new features.  Let me pull up one of these old puzzles and its old solution. I have cleaned up the code just a little bit, so the data element names are better and you can cut & paste it easily.  If you have been around for awhile, you will recognize the old design patterns we used when certain features were not available.  Your task is to write new answers with the current features available in SQL Server. 

The Data Warehouse Problem

This one came in as a data warehouse problem in 1999. You have a history table of customer daily total purchases that looks like this:

CREATE TABLE DailySalesTotals

(customer_id CHAR(10) NOT NULL,

 order_date DATE NOT NULL,

 order_amt DECIMAL(8,2) NOT NULL,

 PRIMARY KEY (customer_id, order_date));

Which we can fill with some dummy data.

INSERT INTO DailySalesTotals (customer_id, order_date, order_amt)

VALUES  ('Celko', '1999-11-28', 450.00),

 ('Curly', '1999-11-25', 400.00),

 ('Curly', '1999-11-26', 300.00),

 ('Curly', '1999-11-27', 400.00),

 ('Curly', '1999-11-28', 450.00),

 ('Larry', '1999-11-25', 400.00),

 ('Larry', '1999-11-26', 400.00),

 ('Larry', '1999-11-27', 450.00),

 ('Larry', '1999-11-28', 400.00),

 ('Moe', '1999-11-25', 400.00),

 ('Moe', '1999-11-26', 400.00),

 ('Moe', '1999-11-27', 400.00),

 ('Moe', '1999-11-28', 400.00);

The problem is to report just those  customers who decreased their purchase amounts on their most recent order placed with us.  We are trying to get an idea when people are saturated with whatever we are selling.  If their order level is holding steady we are happy with them.  I came up with this query back then:

 

SELECT H1.customer_id, ' dropped purchase amount on ',

MAX(H1.order_date)

  FROM DailySalesTotals AS H1

 WHERE H1.order_amt

       < (SELECT H2.order_amt

            FROM DailySalesTotals AS H2

           WHERE H1.customer_id = H2.customer_id

             AND H2.order_date

                 = (SELECT MAX(order_date)

                      FROM DailySalesTotals AS H3

                     WHERE H1.customer_id = H3.customer_id

                      AND H1.order_date > H3.order_date

                   )

           )

  AND H1.order_date = (SELECT MAX( order_date)

                          FROM DailySalesTotals h4

                         WHERE h4.customer_id = H1.customer_id

                        )

 GROUP BY customer_id;

The nested subquery says that the order amount has dropped and then uses another subquery within itself to ask if the date is the most recent date on file for that customer.  Nesting correlated subqueries is usually expensive, so we want to avoid that. 

What is your answer in one SQL Statement? I have included a script with some larger volumes of test data sufficient to compare the timings of both solutions. 

The best answer to each stumper will be given a prize of a $100 Amazon voucher. The stumper will be run simultaneously on SQL Server Central and Simple-Talk. To see all the comments so far, you will need to visit both sites.

 We will take entries for a week after the first Monday of publication, posted as comments to the articles.

Two weeks after the challenge is sent out, the judge's decision and comments will be sent out in the newsletter, and published on the site. Joe Celko and Phil Factor will judge the answers to this puzzle. Your answer should :
  1) Solve the problem -- Duh!
  2) Avoid proprietary features in SQL Server that will not port or be good across all releases, present and future.
  3) Use Standard features in SQL Server that will be good across all releases, present and future. Extra points for porting code.
  4) Be clever but not obscure.
  5) Explain how you got your answer.

Resources:

CelkoWarehousePuzzleData.zip
Total article views: 3339 | Views in the last 30 days: 5
 
Related Articles
FORUM

Customize SQL Query

Customize

FORUM

find out the customers having maximum number of unique orders.

find out the customers having maximum number of unique orders.

FORUM

Adding a trigger for new Customer Order

I need a trigger for when a new Customer Order Header (table) is added, then add a new record onto a...

BLOG

Column alias in order by and where

If you ever wonder why column alias can be used in order by and not in where clause, it’s the...

Tags
puzzle    
sql puzzles    
t-sql    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones