Blog Post

Analytic Functions – FIRST_VALUE / LAST_VALUE

,

This post is part of the series discussing the new Analytic functions in SQL Server “Denali”, CTP3.

Analytic Functions in SQL Server “Denali”

All of the new Analytic functions require the use of the OVER clause. To see what changes to the OVER clause that “Denali” brings, please view my article The OVER Clause enhancements in SQL Server code named “Denali”, CTP3 on SQLServerCentral.com.

Introducing the FIRST_VALUE / LAST_VALUE functions

The FIRST_VALUE and LAST_VALUE functions return the first or last value in the ordered set of your partition, optionally applying a windowing clause to limit the rows being considered.

The syntax of these functions is:

FIRST_VALUE | LAST_VALUE ( [scalar_expression )
    OVER ( [ partition_by_clause ] order_by_clause [ rows_range_clause ] )

Note that the order_by_clause is required, and the partition_by_clause and the row_range_clause are optional.

The scalar_expression is the value to be returned – this will normally be a column, but it can also be a subquery or any other expression that results in a single value.

Using the FIRST_VALUE and LAST_VALUE functions

For this example, I’m going to use the AdventureWorks for Denali database, which can be downloaded here. In this database, there is a Sales.SalesOrderHeader table, which has information about each order. This information includes CustomerID, OrderDate, and TotalDue columns.

Let’s run a query that shows, for each CustomerID, the OrderDate for when they placed their least and most expensive orders. What this entails is using the FIRST_VALUE and LAST_VALUE functions to return the OrderDate column. The OVER clause specifies that the PARTITION BY is the CustomerID, the ORDER BY is the TotalDue, and the ROWS clause specifies all rows for that CustomerID. The resulting code is:

SELECT DISTINCT TOP (20)
       CustomerID,
       OrderDateLow  = FIRST_VALUE(OrderDate)
                       OVER (PARTITION BY CustomerID
                                 ORDER BY TotalDue
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING),
       OrderDateHigh = LAST_VALUE(OrderDate)
                       OVER (PARTITION BY CustomerID
                                 ORDER BY TotalDue
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING)
  FROM Sales.SalesOrderHeader
 ORDER BY CustomerID;

This code produces the following result set:

CustomerID  OrderDateLow              OrderDateHigh
----------- -----------------------   -----------------------
11000       2007-07-22 00:00:00.000   2005-07-22 00:00:00.000
11001       2008-06-12 00:00:00.000   2005-07-18 00:00:00.000
11002       2007-07-04 00:00:00.000   2005-07-10 00:00:00.000
11003       2007-07-09 00:00:00.000   2005-07-01 00:00:00.000
11004       2007-07-26 00:00:00.000   2005-07-26 00:00:00.000
11005       2007-07-27 00:00:00.000   2005-07-02 00:00:00.000
11006       2007-07-02 00:00:00.000   2005-07-27 00:00:00.000
11007       2007-07-25 00:00:00.000   2005-07-12 00:00:00.000
11008       2007-07-07 00:00:00.000   2005-07-28 00:00:00.000
11009       2007-07-24 00:00:00.000   2005-07-30 00:00:00.000
11010       2007-07-15 00:00:00.000   2005-07-17 00:00:00.000
11011       2007-07-29 00:00:00.000   2005-07-02 00:00:00.000
11012       2008-04-17 00:00:00.000   2007-09-17 00:00:00.000
11013       2007-10-15 00:00:00.000   2008-07-24 00:00:00.000
11014       2007-09-24 00:00:00.000   2007-11-01 00:00:00.000
11015       2007-07-22 00:00:00.000   2007-07-22 00:00:00.000
11016       2007-08-13 00:00:00.000   2007-08-13 00:00:00.000
11017       2008-04-16 00:00:00.000   2005-07-15 00:00:00.000
11018       2008-04-26 00:00:00.000   2005-07-20 00:00:00.000
11019       2008-02-23 00:00:00.000   2007-11-26 00:00:00.000

By repeating the use of the function, with the same OVER clause definition, we can get other columns from those same rows.

How about another example?

In my OVER clause enhancements article linked to above on SSC, there is an example of performing a running total. To demonstrate the LAST_VALUE function, I’m going to enhance that code so that it not only shows the running total, but to also show the percent of total that each transaction amount is per account. This will require getting the last running total value of the TranAmt column for each account.

The code to create the sample data is (copied from the above article):

DECLARE @SampleData TABLE (
  AccountId INTEGER,
  TranDate  DATETIME,
  TranAmt   NUMERIC(8,2));
 
INSERT INTO @SampleData
            (AccountId, TranDate, TranAmt)
     VALUES (1, '20110101', 500),
            (1, '20110115', 50),
            (1, '20110122', 250),
            (1, '20110124', 75),
            (1, '20110126', 125),
            (1, '20110128', 175),
 
            (2, '20110101', 500),
            (2, '20110115', 50),
            (2, '20110122', 25),
            (2, '20110123', 125),
            (2, '20110126', 200),
            (2, '20110129', 250),
 
            (3, '20110101', 500),
            (3, '20110115', 50 ),
            (3, '20110122', 5000),
            (3, '20110125', 550),
            (3, '20110127', 95 ),
            (3, '20110130', 2500);

and the code to perform the running totals and percent of total is:

WITH cte AS
(
SELECT AccountId ,
       TranDate ,
       TranAmt,
       -- running total of all transactions
       RunTotalAmt = SUM(TranAmt)
                     OVER (PARTITION BY AccountId
                               ORDER BY TranDate)
  FROM @SampleData
)
SELECT AccountId,
       TranDate,
       TranAmt,
       RunTotalAmt,
       PercentOfTotal = CONVERT(NUMERIC(5,2), TranAmt * 100 /                        LAST_VALUE(RunTotalAmt)                        OVER (PARTITION BY AccountId                                  ORDER BY TranDate                                   ROWS BETWEEN UNBOUNDED PRECEDING                                            AND UNBOUNDED FOLLOWING))  FROM cte
 ORDER BY AccountId, TranDate;

The highlighted lines show the LAST_VALUE function. The OVER clause is utilizing all three parts: the PARTITION BY is to restart the calculation for each AccountId, the ORDER BY is on TranDate to ensure that I get the proper row for the last value, and the ROWS clause is utilized in order to get the last value from all of the rows for this account – if it hadn’t had been specified, it would have used the default RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT_ROW, and that would have every row’s percent being 100% – not what we want!

This code produces this result set:

AccountId   TranDate                  TranAmt   RunTotalAmt PercentOfTotal
----------- -----------------------   -------   ----------- --------------
1           2011-01-01 00:00:00.000   500.00    500.00      42.55
1           2011-01-15 00:00:00.000   50.00     550.00      4.26
1           2011-01-22 00:00:00.000   250.00    800.00      21.28
1           2011-01-24 00:00:00.000   75.00     875.00      6.38
1           2011-01-26 00:00:00.000   125.00    1000.00     10.64
1           2011-01-28 00:00:00.000   175.00    1175.00     14.89
2           2011-01-01 00:00:00.000   500.00    500.00      43.48
2           2011-01-15 00:00:00.000   50.00     550.00      4.35
2           2011-01-22 00:00:00.000   25.00     575.00      2.17
2           2011-01-23 00:00:00.000   125.00    700.00      10.87
2           2011-01-26 00:00:00.000   200.00    900.00      17.39
2           2011-01-29 00:00:00.000   250.00    1150.00     21.74
3           2011-01-01 00:00:00.000   500.00    500.00      5.75
3           2011-01-15 00:00:00.000   50.00     550.00      0.58
3           2011-01-22 00:00:00.000   5000.00   5550.00     57.50
3           2011-01-25 00:00:00.000   550.00    6100.00     6.33
3           2011-01-27 00:00:00.000   95.00     6195.00     1.09
3           2011-01-30 00:00:00.000   2500.00   8695.00     28.75

As you can see, the PercentOfTotal column calculates the percentage between the current row’s TranAmt and the TranAmt total for that account.

Hopefully, by seeing some examples, you now have some ideas of how you can utilize these new functions.

References:

FIRST_VALUE

LAST_VALUE

Analytic Function Series:

Overview

LAG/LEAD

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating