SQLServerCentral Article

Linking to the Previous Row

,

Two of the coolest additions to T-SQL in SQL 2005 are Common Table Expressions (CTEs), and the Row_number() function. In this article we're going to see how you can use these two features together to provide an elegant solution to an age old problem.

First an extremely quick look at each of them.

Common Table Expressions

A CTE is a temporary result set, which is valid for the scope of a single Select Update Insert or Delete statement. I've heard it said that this very restricted scope seriously limits their usefulness, and that it's only function is to improve the readability of your SQL. (Writing recursive queries is the obvious area where CTEs come in to play - but I won't be covering this in this article.)

I hope to show in this article that the CTEs usefulness is not purely cosmetic.

One key point that is often overlooked is that the resultset can be referenced multiple times in the S/U/I/D statement. We'll see this fact being put to use later on.

Row_number()

The row_number() function has been on most SQL Developers Christmas list for many years now. It provides a sequential row number for each record in a result set. An ordering must be specified (to give a basis to the numbering), and optionally a partition may be specified to split the result set numbering. We'll look more closely at what this means with our example.

The Price History example

Our retail client is storing a history of price changes in a table PriceHistory. For each item (in the Items table) we can see the initial price, and an additional record for each subsequent price change. The table creation script is available below in the Resources section.

Data from the PriceHistory table

The client would like to see a report showing the Item name, the old price, the new price and the date range for which the new price was applied. This type of question has often given headaches to sql developers, as it usually involved messy subqueries using the max / min functions. Some DBAs prefer to store both the start date and end date in the table, but this too causes problems, as it is difficult to keep subsequent records in synch, especially when a modification occurs.

ItemOld PriceRange PriceStart DateEnd Date
vacuum cleaner 250.002004-03-012005-06-15
vacuum cleaner250.00219.992005-06-152007-01-03
vacuum cleaner219.99189.992007-01-032007-02-03
vacuum cleaner189.99200.002007-02-03 
washing machine 650.002006-07-122007-01-03
washing machine650.00550.002007-01-03 
toothbrush 1.992005-01-012006-01-01
toothbrush1.991.792006-01-012007-01-01
toothbrush1.791.592007-01-012008-01-01
toothbrush1.591.492008-01-01 

The format we're aiming for

If we examine one line from the target report, we can see that it includes information from 3 consecutive rows in the table. The key to solving this problem lies in linking the current row to the previous row and to the next row.

vacuum cleaner219.99189.992007-01-032007-02-03

 

ItemIdPriceStartDatePrice
12004-03-01250.00
12005-06-15219.99
12007-01-03189.99
12007-02-03200.00
22006-07-12650.00
22007-01-03550.00
32005-01-011.99
32006-01-011.79
32007-01-011.59
32008-01-011.49

The first step is to construct a CTE which numbers the rows of the data table.  (Note that the semi-colon is required when the CTE is not the first statement in a batch.)

;WITH PriceCompare AS (
SELECT i.Item, ph.ItemId, ph.PriceStartDate, ph.Price,
ROW_NUMBER() OVER (Partition BY ph.ItemId ORDER BY PriceStartDate) AS rownum 
FROM Items i INNER JOIN PriceHistory ph 
ON i.ItemId = ph.ItemId) 
SELECT * FROM PriceCompare 

In the CTE, I add a new column, rownum, whose value comes from the row_number() function. You can see that I've partitioned by ItemId which means that numbering recommences with each new item. I've ordered on the PriceStartDate column, to tell the CTE how it should apply the numbering i.e. the earliest PriceStartDate for an item will have number 1 etc..

ItemItemIdPriceStartDatePricerownum
vacuum cleaner12004-03-01250.001
vacuum cleaner12005-06-15219.992
vacuum cleaner12007-01-03189.993
vacuum cleaner12007-02-03200.004
washing machine22006-07-12650.001
washing machine22007-01-03550.002
toothbrush32005-01-011.991
toothbrush32006-01-011.792
toothbrush32007-01-011.593
toothbrush32008-01-011.494

Now, I'll enhance the SELECT statement which follows the CTE.

SELECT currow.Item, prevrow.Price AS OldPrice, currow.Price AS RangePrice, currow.PriceStartDate AS StartDate, nextrow.PriceStartDate AS EndDate 
FROM PriceCompare currow 
LEFT JOIN PriceCompare nextrow 
        ON currow.rownum = nextrow.rownum - 1
        AND currow.ItemId = nextrow.ItemId
LEFT JOIN PriceCompare prevrow
        ON currow.rownum = prevrow.rownum + 1
        AND currow.ItemId = prevrow.ItemId

Note that I use Left Joins as for the first row for an item there is no previous row, just as the last row has no next row.

I use the aliases before each field to denote which row the data should come from. Thus PriceStartDate from the current row is the Start Date while the same field from the next row is the End Date.

Running the query now, gives the resultset required for the Price History report. An additional bonus is that you can wrap up the whole CTE in a view!

                        CREATE VIEW [dbo].[PriceCompare] AS
WITH PriceCompare AS 
(
SELECT i.Item, ph.ItemId, ph.PriceStartDate, ph.Price, 
ROW_NUMBER() OVER (Partition BY ph.ItemId ORDER BY PriceStartDate) AS rownum 
FROM 
        Items i 
INNER JOIN 
        PriceHistory ph 
ON i.ItemId = ph.ItemId
)
 SELECT
        currow.Item, 
        prevrow.Price AS OldPrice, 
        currow.Price AS RangePrice, 
        currow.PriceStartDate AS StartDate, 
        nextrow.PriceStartDate AS EndDate 
FROM 
        PriceCompare currow 
LEFT JOIN PriceCompare nextrow 
        ON currow.rownum = nextrow.rownum - 1 AND currow.ItemId = nextrow.ItemId 
LEFT JOIN PriceCompare prevrow 
        ON currow.rownum = prevrow.rownum + 1 AND currow.ItemId = prevrow.ItemId
                        

I hope to have shown with this article why you should be using CTEs in your applications, and not just as a way to tidy up your SQL.

Resources

Rate

4.71 (321)

You rated this post out of 5. Change rating

Share

Share

Rate

4.71 (321)

You rated this post out of 5. Change rating