Windows Functions: Tell me when that changes.

  • Comments posted to this topic are about the item Windows Functions: Tell me when that changes.

  • Just to make it clear, they are windowing functions or window functions.

    Windows functions sounds like API of the OS :).

    Good Article, great reminder!

  • hartmann 74688 (2/17/2016)


    Just to make it clear, they are windowing functions or window functions.

    Windows functions sounds like API of the OS :).

    Good Article, great reminder!

    I noticed that too (and have made the same mistake). I don't know the SSC protocol for correcting articles but I hope that gets sorted out so this excellent article has a better chance of showing up earlier in a Google search.

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Thanks for the comments guys. Nomenclature is important and I will see what SSC can do to update the article.

  • Alternative method for those without LEAD() and LAG()

    WITH cteSAMPLEDATA

    AS (--Create Sample data

    SELECT ID, ORDER_ID, CODE, ORDER_DATE FROM

    (VALUES

    (1, 'ORDER1', 'DS', N'20151001'),

    (2, 'ORDER2', 'DS', N'20151001'),

    (5, 'ORDER2', 'DS', N'20151002'),

    (3, 'ORDER3', 'DS', N'20151001'),

    (6, 'ORDER3', 'AG', N'20151002'),

    (8, 'ORDER3', 'AG', N'20151003'),

    (4, 'ORDER4', 'DS', N'20151001'),

    (7, 'ORDER4', 'AG', N'20151002'),

    (9, 'ORDER4', 'DS', N'20151003')

    )

    AS MYTABLE(ID, ORDER_ID, CODE, ORDER_DATE)

    )

    SELECT A.ID

    ,A.ORDER_ID

    ,A.CODE

    ,A.ORDER_DATEAS ORDER_DATE

    ,B.IDAS PREVIOUS_ID

    ,B.CODEAS PREVIOUS_CODE

    ,B.ORDER_DATEAS PREVIOUS_ORDERDATE

    FROM (

    SELECT DISTINCT X.*

    FROM cteSAMPLEDATA C

    CROSS APPLY(

    SELECT TOP 1 *

    FROM cteSAMPLEDATA Z

    WHERE Z.ORDER_ID = C.ORDER_ID

    ORDER BY Z.ORDER_DATE DESC

    ) X

    ) A

    CROSS APPLY(

    SELECT TOP 1 *

    FROM cteSAMPLEDATA Z

    WHERE Z.ORDER_ID = A.ORDER_ID

    AND Z.ORDER_DATE<A.ORDER_DATE

    ORDER BY Z.ORDER_DATE DESC

    ) B

    WHERE A.CODE<>B.CODE

  • Great article. And since you are going to clean it up for "Windows" vs "Windowing", you can clean up the following typos, too 🙂

    1. "Meaning it is the only order whose most recent date (2015-10-03) has a different code (DS) that the one before it (AG) from 2015-10-02"

    ("that" should be "than")

    2. “(note this is for SQL Server 2012+ and Azure):

    “Accesses data from a subsequent row in the same result set without the use of a self-join in SQL Server 2016. ..."

    (confusing you quote from SQL Server 2016 when saying it is ok in 2012+)

    3. "Since our rows as sorted by ORDER_DATE descending, that would be what we want." ("as" should be "are")

    4. "LEAD(H1.CODE,1) --Get the code 1 row after the current row.

    OVER (PARTITION BY ORDER_ID

    ORDER BY ORDER_DATE DESC) AS PREVIOUS_CODE"

    "LEAD(H1.CODE,1) --Get the code 1 row before the current row.

    OVER (PARTITION BY ORDER_ID

    ORDER BY ORDER_DATE DESC) AS PREVIOUS_CODE" (first comment says "after" but second occurance says "before")

  • Thanks Michael for doing this article. I thought it was a good explanation and example.

    However I was also distracted by the window/windows difference and some typos / misspellings.

    Here are a couple more I found:

    "They wanted to know when the most resent order’s code was different from the one previous to it." (I think you meant "recent," not that it was re-sent. It seems to be correct everywhere else)

    "In this case we want to partition our recordset by ORDER_ID, essentially grouping all rows with the sam ORDER_ID together."

    (sam/same)

  • Great content, other than the typos and nomenclature things others have already mentioned. Thanks!

  • Re: Alternative method for those without LEAD() and LAG()

    The original code WITH LEAD() and LAG() seems easier to read for me.

    The alternative method would apply in a version of SSQL which does not support LEAD or LAG. Otherwise, time to look at using the new features.

    Maybe someone who is an old hand at this could compare the actual execution plans to see which method scales better ?

  • I have encountered this precise scenario a dozen times.

    You can do the same thing with a correlated sub-query, and they usually scale better than any CTE because a CTE is just an in-line view, which SQL Server will generate first.

    In other words, SS will create ALL THE ROWS for a CTE before it processes the main query, whether it will need all those rows or not, and you can't put an index on a CTE.

    Slow and disk intensive. Bleh! Better to create 2 temp tables and index them: 1 for the current order and 1 for the previous order.

    Then join. Last time I did that the stored proc went from 2 hours to 20 minutes.

  • davidawest00 (2/17/2016)


    I have encountered this precise scenario a dozen times.

    You can do the same thing with a correlated sub-query, and they usually scale better than any CTE because a CTE is just an in-line view, which SQL Server will generate first.

    In other words, SS will create ALL THE ROWS for a CTE before it processes the main query, whether it will need all those rows or not, and you can't put an index on a CTE.

    Slow and disk intensive. Bleh! Better to create 2 temp tables and index them: 1 for the current order and 1 for the previous order.

    Then join. Last time I did that the stored proc went from 2 hours to 20 minutes.

    Good explanation.

    Just out of curiousity - how many records took 20 minutes ?

  • It was to get the latest + previous purchase per customer...about 40 million out of 60-80 million rows...twice.

    Once for current and once for previous.

  • These articles with real life examples are great to help people get rid of cursors and change the mentalities.

    I just wanted to make 2 comments:

    For sample data, create a table (even a temp table) to prevent repeating the creation of sample data each time to concentrate in the real solution. This will also help define the correct data types.

    CREATE TABLE MYTABLE(

    ID int,

    ORDER_ID char(6),

    CODE char(2),

    ORDER_DATE date

    );

    INSERT INTO MYTABLE

    VALUES

    (1, 'ORDER1', 'DS', '20151001'),

    (2, 'ORDER2', 'DS', '20151001'),

    (5, 'ORDER2', 'DS', '20151002'),

    (3, 'ORDER3', 'DS', '20151001'),

    (6, 'ORDER3', 'AG', '20151002'),

    (8, 'ORDER3', 'AG', '20151003'),

    (4, 'ORDER4', 'DS', '20151001'),

    (7, 'ORDER4', 'AG', '20151002'),

    (9, 'ORDER4', 'DS', '20151003');

    Second, I found that there might be an alternative that can give a simpler plan. I'm not sure if that means better performance, but it usually helps. The secret is to reuse your windows as much as possible.

    WITH cteGET_MAX_DATE_AND_CODE

    AS (

    SELECT

    H1.ID,

    H1.ORDER_ID,

    H1.CODE,

    H1.ORDER_DATE,

    ROW_NUMBER() --Identify the latest row for each ORDER

    OVER (PARTITION BY ORDER_ID ORDER BY ORDER_DATE DESC) AS ROW_NUM,

    LEAD(H1.CODE,1) --Get the code 1 row before the current row.

    OVER (PARTITION BY ORDER_ID

    ORDER BY ORDER_DATE DESC) AS PREVIOUS_CODE

    FROM MYTABLE H1

    )

    SELECT

    ID, ORDER_ID, CODE, PREVIOUS_CODE, ORDER_DATE

    FROM cteGET_MAX_DATE_AND_CODE

    WHERE

    ROW_NUM = 1

    AND CODE != PREVIOUS_CODE

    AND PREVIOUS_CODE IS NOT NULL;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Agreed, always look to use new features.

    We have a couple of old 2008R2 servers and are sometimes forced to go old school on a few queries, hence my post.

    I'd be interested in the analysis of the execution plans if someone wants to step up?

  • The execution plans do not tell the whole story. I have seen some cases by Gila Monster where the execution was simplified but the load on the server was not improved (roughly same execution time.

Viewing 15 posts - 1 through 15 (of 30 total)

You must be logged in to reply to this topic. Login to reply