Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

T-SQL Enhancements: FIRST_VALUE() and LAST_VALUE()

A couple a weeks ago I blogged about a few of the enhancements in the OVER clause, and now I will show you guys a couple of new windowing functions, that goes along with the OVER clause. These new functions are FIRST_VALUE() and LAST_VALUE().

I’ll use the same setup as I did in the OVER clause post:

CREATE TABLE Employees (
    EmployeeId INT IDENTITY PRIMARY KEY,
    Name VARCHAR(50),
    HireDate DATE NOT NULL,
    Salary INT NOT NULL
)
GO
INSERT INTO Employees (Name, HireDate, Salary)
VALUES
    ('Alice', '2011-01-01', 20000),
    ('Brent', '2011-01-15', 19000),
    ('Carlos', '2011-02-01', 22000),
    ('Donna', '2011-03-01', 25000),
    ('Evan', '2011-04-01', 18500)
GO

 

Last time I found the average salary of the entire group, and also controlled the size of the window frame. Now I would like to know the name, salary and hire date of the person with the higest salary. In the good old days we would solve this by using a mess of subqueries, but now we can do this quite elegantly:

SELECT
    EmployeeId,
    Name,
    Salary,
    HireDate,
    LAST_VALUE(Name) OVER(
                    ORDER BY Salary
                    RANGE BETWEEN UNBOUNDED PRECEDING
                    AND UNBOUNDED FOLLOWING) AS HighestSalaryName,
    LAST_VALUE(Salary) OVER(
                    ORDER BY Salary
                    RANGE BETWEEN UNBOUNDED PRECEDING
                    AND UNBOUNDED FOLLOWING) AS HighestSalary,
    LAST_VALUE(HireDate) OVER(
                    ORDER BY Salary
                    RANGE BETWEEN UNBOUNDED PRECEDING
                    AND UNBOUNDED FOLLOWING) AS HiredateOfHighestSalary
FROM Employees
ORDER BY EmployeeId
GO

image

This resultset gives us one row per employee, but with the following columns added: HigestSalaryName, HighestSalary and HireDateOfHigestSalary.

The function LAST_VALUE(x) returns the value of column x in the last row of the window frame defined in the OVER clause. I have defined the ORDER BY clause to order by Salary, so the FIRST_VALUE() will look at the row with the lowest salary, and the LAST_VALUE() will look at the row with the higest salary.

The beauty of these functions is, that we can return any of the columns in the dataset given any order by clause. If we think about the MIN(x) and MAX(x) functions, these can only return the value of column x ordered by column x.

Again we can add the new ROWS or RANGE clause to narrow down the window frame, but in the example above I just wanted to look at the entire dataset as one group.

 

@HenrikSjang

Comments

Posted by Anonymous on 30 August 2011

Pingback from  Dew Drop – August 30, 2011 | Alvin Ashcraft's Morning Dew

Posted by Jason Brimhall on 30 August 2011

Thanks for the nice examples.

Posted by Carlo Romagnano on 31 August 2011

Thank you for the news.

Posted by peter.row on 31 August 2011

Syntax seems to be a bit ugly compared to a simple select ... from employess cross join (select max(...) as highest

When you say "The beauty of these functions is, that we can return any of the columns in the dataset given any order by clause. If we think about the MIN(x) and MAX(x) functions, these can only return the value of column x ordered by column x." is that correct? Using MIN or MAX you would not technically have had to order by anything, would you?

What would have been interesting would be a performance comparision between the example you gave and one without the new syntax.

Posted by Robert L Davis on 31 August 2011

Peter, you can't specify a range of data for a MIN or MAX. Plus, you can't simply add a MIN or MAX to a query with other columns unless you are using a GROUP BY or the MIN/MAX is contained within a subquery, both of which is much much worse for performance.

Posted by Charles Kincaid on 31 August 2011

Then look a geo data.  In a trip where I'm gathering co-ordinate points I need to know the first Lat/Long of the trip and the last Lat/Long of the trip.  So I'd be pulling LAST_VALUE(Coordinate) but the OVEr would be date based.  Without this I'd have to use a CTE to get the first co-ordinate and another to get the last.

Wonderful syntax aid.  Now how about query plans and performaqnce?

Leave a Comment

Please register or log in to leave a comment.