Windows Functions: Tell me when that changes.

Question

Post reply

Windows Functions: Tell me when that changes.

michael.f.morin

SSC-Addicted

Points: 406
More actions
February 16, 2016 at 10:50 pm

#308639

Comments posted to this topic are about the item Windows Functions: Tell me when that changes.

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply

hartmann 74688 Old Hand Points: 346 More actions · Answer 1

Just to make it clear, they are windowing functions or window functions.

Windows functions sounds like API of the OS :).

Good Article, great reminder!

Alan Burstein SSC Guru Points: 61152 More actions · Answer 2

hartmann 74688 (2/17/2016)
Just to make it clear, they are windowing functions or window functions.
Windows functions sounds like API of the OS :).
Good Article, great reminder!

I noticed that too (and have made the same mistake). I don't know the SSC protocol for correcting articles but I hope that gets sorted out so this excellent article has a better chance of showing up earlier in a Google search.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

michael.f.morin SSC-Addicted Points: 406 More actions · Answer 3

Thanks for the comments guys. Nomenclature is important and I will see what SSC can do to update the article.

AdamE SSC Enthusiast Points: 131 More actions · Answer 4

Alternative method for those without LEAD() and LAG()

WITH cteSAMPLEDATA

AS (--Create Sample data

SELECT ID, ORDER_ID, CODE, ORDER_DATE FROM

(VALUES

(1, 'ORDER1', 'DS', N'20151001'),

(2, 'ORDER2', 'DS', N'20151001'),

(5, 'ORDER2', 'DS', N'20151002'),

(3, 'ORDER3', 'DS', N'20151001'),

(6, 'ORDER3', 'AG', N'20151002'),

(8, 'ORDER3', 'AG', N'20151003'),

(4, 'ORDER4', 'DS', N'20151001'),

(7, 'ORDER4', 'AG', N'20151002'),

(9, 'ORDER4', 'DS', N'20151003')

)

AS MYTABLE(ID, ORDER_ID, CODE, ORDER_DATE)

)

SELECT A.ID

,A.ORDER_ID

,A.CODE

,A.ORDER_DATEAS ORDER_DATE

,B.IDAS PREVIOUS_ID

,B.CODEAS PREVIOUS_CODE

,B.ORDER_DATEAS PREVIOUS_ORDERDATE

FROM (

SELECT DISTINCT X.*

FROM cteSAMPLEDATA C

CROSS APPLY(

SELECT TOP 1 *

FROM cteSAMPLEDATA Z

WHERE Z.ORDER_ID = C.ORDER_ID

ORDER BY Z.ORDER_DATE DESC

) X

) A

CROSS APPLY(

SELECT TOP 1 *

FROM cteSAMPLEDATA Z

WHERE Z.ORDER_ID = A.ORDER_ID

AND Z.ORDER_DATE<A.ORDER_DATE

ORDER BY Z.ORDER_DATE DESC

) B

WHERE A.CODE<>B.CODE

dgreen-1126628 SSC-Addicted Points: 447 More actions · Answer 5

Great article. And since you are going to clean it up for "Windows" vs "Windowing", you can clean up the following typos, too 🙂

1. "Meaning it is the only order whose most recent date (2015-10-03) has a different code (DS) that the one before it (AG) from 2015-10-02"

("that" should be "than")

2. “(note this is for SQL Server 2012+ and Azure):

“Accesses data from a subsequent row in the same result set without the use of a self-join in SQL Server 2016. ..."

(confusing you quote from SQL Server 2016 when saying it is ok in 2012+)

3. "Since our rows as sorted by ORDER_DATE descending, that would be what we want." ("as" should be "are")

4. "LEAD(H1.CODE,1) --Get the code 1 row after the current row.

OVER (PARTITION BY ORDER_ID

ORDER BY ORDER_DATE DESC) AS PREVIOUS_CODE"

"LEAD(H1.CODE,1) --Get the code 1 row before the current row.

OVER (PARTITION BY ORDER_ID

ORDER BY ORDER_DATE DESC) AS PREVIOUS_CODE" (first comment says "after" but second occurance says "before")

timwell SSCertifiable Points: 5315 More actions · Answer 6

Thanks Michael for doing this article. I thought it was a good explanation and example.

However I was also distracted by the window/windows difference and some typos / misspellings.

Here are a couple more I found:

"They wanted to know when the most resent order’s code was different from the one previous to it." (I think you meant "recent," not that it was re-sent. It seems to be correct everywhere else)

"In this case we want to partition our recordset by ORDER_ID, essentially grouping all rows with the sam ORDER_ID together."

(sam/same)

Stephen Rybacki Hall of Fame Points: 3402 More actions · Answer 7

Great content, other than the typos and nomenclature things others have already mentioned. Thanks!

j-1064772 SSCertifiable Points: 5316 More actions · Answer 8

Re: Alternative method for those without LEAD() and LAG()

The original code WITH LEAD() and LAG() seems easier to read for me.

The alternative method would apply in a version of SSQL which does not support LEAD or LAG. Otherwise, time to look at using the new features.

Maybe someone who is an old hand at this could compare the actual execution plans to see which method scales better ?

davidawest00 SSC Enthusiast Points: 123 More actions · Answer 9

I have encountered this precise scenario a dozen times.

You can do the same thing with a correlated sub-query, and they usually scale better than any CTE because a CTE is just an in-line view, which SQL Server will generate first.

In other words, SS will create ALL THE ROWS for a CTE before it processes the main query, whether it will need all those rows or not, and you can't put an index on a CTE.

Slow and disk intensive. Bleh! Better to create 2 temp tables and index them: 1 for the current order and 1 for the previous order.

Then join. Last time I did that the stored proc went from 2 hours to 20 minutes.

j-1064772 SSCertifiable Points: 5316 More actions · Answer 10

davidawest00 (2/17/2016)
I have encountered this precise scenario a dozen times.
You can do the same thing with a correlated sub-query, and they usually scale better than any CTE because a CTE is just an in-line view, which SQL Server will generate first.
In other words, SS will create ALL THE ROWS for a CTE before it processes the main query, whether it will need all those rows or not, and you can't put an index on a CTE.
Slow and disk intensive. Bleh! Better to create 2 temp tables and index them: 1 for the current order and 1 for the previous order.
Then join. Last time I did that the stored proc went from 2 hours to 20 minutes.

Good explanation.

Just out of curiousity - how many records took 20 minutes ?

davidawest00 SSC Enthusiast Points: 123 More actions · Answer 11

It was to get the latest + previous purchase per customer...about 40 million out of 60-80 million rows...twice.

Once for current and once for previous.

Luis Cazares SSC Guru Points: 183706 More actions · Answer 12

These articles with real life examples are great to help people get rid of cursors and change the mentalities.

I just wanted to make 2 comments:

For sample data, create a table (even a temp table) to prevent repeating the creation of sample data each time to concentrate in the real solution. This will also help define the correct data types.

CREATE TABLE MYTABLE(

ID int,

ORDER_ID char(6),

CODE char(2),

ORDER_DATE date

);

INSERT INTO MYTABLE

VALUES

(1, 'ORDER1', 'DS', '20151001'),

(2, 'ORDER2', 'DS', '20151001'),

(5, 'ORDER2', 'DS', '20151002'),

(3, 'ORDER3', 'DS', '20151001'),

(6, 'ORDER3', 'AG', '20151002'),

(8, 'ORDER3', 'AG', '20151003'),

(4, 'ORDER4', 'DS', '20151001'),

(7, 'ORDER4', 'AG', '20151002'),

(9, 'ORDER4', 'DS', '20151003');

Second, I found that there might be an alternative that can give a simpler plan. I'm not sure if that means better performance, but it usually helps. The secret is to reuse your windows as much as possible.

WITH cteGET_MAX_DATE_AND_CODE

AS (

SELECT

H1.ID,

H1.ORDER_ID,

H1.CODE,

H1.ORDER_DATE,

ROW_NUMBER() --Identify the latest row for each ORDER

OVER (PARTITION BY ORDER_ID ORDER BY ORDER_DATE DESC) AS ROW_NUM,

LEAD(H1.CODE,1) --Get the code 1 row before the current row.

OVER (PARTITION BY ORDER_ID

ORDER BY ORDER_DATE DESC) AS PREVIOUS_CODE

FROM MYTABLE H1

)

SELECT

ID, ORDER_ID, CODE, PREVIOUS_CODE, ORDER_DATE

FROM cteGET_MAX_DATE_AND_CODE

WHERE

ROW_NUM = 1

AND CODE != PREVIOUS_CODE

AND PREVIOUS_CODE IS NOT NULL;

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

AdamE SSC Enthusiast Points: 131 More actions · Answer 13

Agreed, always look to use new features.

We have a couple of old 2008R2 servers and are sometimes forced to go old school on a few queries, hence my post.

I'd be interested in the analysis of the execution plans if someone wants to step up?

j-1064772 SSCertifiable Points: 5316 More actions · Answer 14

The execution plans do not tell the whole story. I have seen some cases by Gila Monster where the execution was simplified but the load on the server was not improved (roughly same execution time.