The SQL Server LAG Function

Mike Byrd, 2019-01-21 (first published: 2019-01-14)

The LAG function (along with many other window functions) first appeared in SQL Server 2012. When it first appeared I thought, well, that’s a function I probably will not use, but just recently I used it in a benchmarking project for my new 2019 SQL Saturday index presentation. It hit me that I was so wrapped up in doing something the old way that I had completely forgotten about a better way to accomplish the same task with fewer resources and less IO.

Microsoft describes the LAG function as accessing “data from a previous row in the same result set without the use of a self-join. LAG provides access to a row at a given physical offset that comes before the current row.” It turns out to be very useful for comparing a row with the one before it, for example when calculating differences between rows.
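To make that description concrete, here is a minimal sketch of LAG on a tiny, made-up data set. The table variable @Readings and its values are hypothetical and exist only for illustration; the optional second and third arguments to LAG are the offset and the default returned when there is no earlier row.

```sql
-- Hypothetical sample data; the table variable and its values are illustrative only.
DECLARE @Readings TABLE (ReadingDate date NOT NULL, Amount int NOT NULL);

INSERT INTO @Readings (ReadingDate, Amount)
VALUES ('2019-01-01', 100),
       ('2019-01-02', 130),
       ('2019-01-03', 125);

SELECT ReadingDate,
       Amount,
       -- Value from the previous row in ReadingDate order; NULL on the first row.
       LAG(Amount) OVER (ORDER BY ReadingDate) AS PrevAmount,
       -- Explicit offset (1) and default (0): the first row returns 0 instead of NULL.
       LAG(Amount, 1, 0) OVER (ORDER BY ReadingDate) AS PrevAmountOrZero,
       -- Difference from the previous row; NULL on the first row.
       Amount - LAG(Amount) OVER (ORDER BY ReadingDate) AS ChangeFromPrev
FROM @Readings
ORDER BY ReadingDate;
```

No self-join is needed: each row reads its predecessor within the window defined by the OVER clause.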

In my project, the query I was trying to benchmark computes the date of each customer’s previous order and the difference between that date and the current order date, using the AdventureWorks2012Big database. AdventureWorks2012Big is an AdventureWorks2012 database enlarged by a script from Jonathan Kehayias (http://sqlskills.com/blogs/jonathan) that increases the Sales.SalesOrderHeader table from 31,465 rows to 1,290,065 rows.

The tests below were run on the following SQL Server build, as reported by SELECT @@VERSION:

Microsoft SQL Server 2017 (RTM-CU12) (KB4464082) - 14.0.3045.24 (X64)

Oct 18 2018 23:11:05

Developer Edition (64-bit) on Windows 10 Pro 10.0 <X64> (Build 17134: )

So, my original query was

When executed with the original AdventureWorks2012 indexes (clustered and nonclustered), it had this execution plan:

There are obviously two separate scans of the clustered index on Sales.SalesOrderHeader.
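For readers who want to see where a second scan can come from, the following is a sketch of one common pre-LAG pattern for "previous order per customer". It is an assumption about the general shape of such a query, not necessarily the query benchmarked here; it uses the SalesOrderID, CustomerID, and OrderDate columns of Sales.SalesOrderHeader, and the aliases cur, p, and prev are illustrative.

```sql
-- Sketch only: one common pre-LAG pattern, not necessarily the query benchmarked here.
-- Assumes the AdventureWorks Sales.SalesOrderHeader table.
SELECT cur.CustomerID,
       cur.SalesOrderID,
       cur.OrderDate,
       prev.OrderDate AS PrevOrderDate,
       DATEDIFF(DAY, prev.OrderDate, cur.OrderDate) AS DaysSincePrevOrder
FROM Sales.SalesOrderHeader AS cur
OUTER APPLY (
    -- Second read of the same table: the customer's most recent earlier order.
    -- SalesOrderID breaks ties when two orders share the same OrderDate.
    SELECT TOP (1) p.OrderDate
    FROM Sales.SalesOrderHeader AS p
    WHERE p.CustomerID = cur.CustomerID
      AND (p.OrderDate < cur.OrderDate
           OR (p.OrderDate = cur.OrderDate AND p.SalesOrderID < cur.SalesOrderID))
    ORDER BY p.OrderDate DESC, p.SalesOrderID DESC
) AS prev
ORDER BY cur.CustomerID, cur.OrderDate, cur.SalesOrderID;
```

Because the table is referenced twice, the optimizer has to read it twice unless a supporting index exists.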

After looking at the above query plan I got to cogitating and remembered the LAG function. After looking up the syntax in the Microsoft documentation, I rewrote the query as

The results were identical to the original query, but with a much nicer query plan that looked like this:

This plan has only one clustered index scan of Sales.SalesOrderHeader, and therefore much less logical IO.
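A LAG-based query for the same problem might look like the sketch below. As before, this assumes the SalesOrderID, CustomerID, and OrderDate columns of Sales.SalesOrderHeader and may differ in detail from the query that was actually benchmarked.

```sql
-- Sketch only: a LAG-based version; the benchmarked query may differ in detail.
-- Assumes the AdventureWorks Sales.SalesOrderHeader table.
SELECT CustomerID,
       SalesOrderID,
       OrderDate,
       -- Previous order date for the same customer; NULL for the customer's first order.
       LAG(OrderDate) OVER (PARTITION BY CustomerID
                            ORDER BY OrderDate, SalesOrderID) AS PrevOrderDate,
       -- Days between the previous order and this one; NULL for the first order.
       DATEDIFF(DAY,
                LAG(OrderDate) OVER (PARTITION BY CustomerID
                                     ORDER BY OrderDate, SalesOrderID),
                OrderDate) AS DaysSincePrevOrder
FROM Sales.SalesOrderHeader
ORDER BY CustomerID, OrderDate, SalesOrderID;
```

PARTITION BY CustomerID restarts the window for each customer, so a customer's first order never picks up another customer's date. The table appears only once in the FROM clause, which is what allows a single scan.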

Comparing the performance numbers from each query gave me this:

Metric	Query 1	Query 2 (LAG)
Scan Count	10	0
Logical Reads (pages)	59,842	30,022
CPU time (ms)	123,827	5,499
Elapsed time (ms)	28,209	8,246
Query Cost	15,731.6	63.0

The original query took about 28 seconds, while the LAG version took about 8 seconds. The bigger gains were in logical reads, which dropped by roughly half (59,842 to 30,022 pages), and in CPU time, which fell from about 124 seconds to about 5.5 seconds. These kinds of performance gains can really catch the attention of a manager waiting for a report.
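Scan counts, logical reads, and CPU and elapsed times like these are the figures SQL Server reports when statistics output is switched on for a session. The article does not say how the numbers were gathered, so treat the following as a minimal sketch of one way to collect comparable figures; the COUNT(*) query is only a placeholder for the query under test.

```sql
-- Turn on per-statement IO and timing output for this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder: substitute the query being measured.
SELECT COUNT(*) AS OrderCount
FROM Sales.SalesOrderHeader;

-- Turn the output back off when finished.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

In SQL Server Management Studio the results appear on the Messages tab. Query cost is not part of this output; it is the estimated subtree cost shown in the execution plan.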

I bring this up because sometimes we get stuck in our ways and forget the new stuff. Once again this old dog re-learned a new trick.
