The Cascading (CROSS) APPLY

  • Hi Amy

    This runs about 4 times faster than the original against the sample data set.

    SELECT

    w.ID,

    Wage = MIN(w.Wage),

    Sector = MIN(w.Sector),

    tw = MIN(a.tw)

    FROM #WageData w

    INNER JOIN (

    SELECT

    ID,

    mw = MAX(wage),

    tw = SUM(wage)

    FROM #WageData wi

    GROUP BY ID

    ) a

    ON a.ID = w.ID AND a.mw = w.Wage

    GROUP BY w.ID

    How does it perform against the actual data? Can you post the actual execution plan please?

    Cheers

    ChrisM

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (6/17/2013)


    Hi Amy

    This runs about 4 times faster than the original against the sample data set.

    SELECT

    w.ID,

    Wage = MIN(w.Wage),

    Sector = MIN(w.Sector),

    tw = MIN(a.tw)

    FROM #WageData w

    INNER JOIN (

    SELECT

    ID,

    mw = MAX(wage),

    tw = SUM(wage)

    FROM #WageData wi

    GROUP BY ID

    ) a

    ON a.ID = w.ID AND a.mw = w.Wage

    GROUP BY w.ID

    How does it perform against the actual data? Can you post the actual execution plan please?

    Cheers

    ChrisM

    This seems to a bit simpler and works with the sample data:

    SELECT ID, Wage=MAX(Wage), Sector=MIN(Sector), tw=SUM(Wage)

    FROM #WageData

    GROUP BY ID

    Did I miss something?


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (6/17/2013)


    ...

    This seems to a bit simpler and works with the sample data:

    SELECT ID, Wage=MAX(Wage), Sector=MIN(Sector), tw=SUM(Wage)

    FROM #WageData

    GROUP BY ID

    Did I miss something?

    Well I'll be darned...nice one, Dwain. Must have got sidetracked.

    Interestingly, it's only a few percent faster than my overcomplex effort. Instead of reading the table twice, which you would expect looking at the two table references, SQL Server replays a table spool.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • dwain.c (6/17/2013)


    ChrisM@Work (6/17/2013)


    Hi Amy

    This runs about 4 times faster than the original against the sample data set.

    SELECT

    w.ID,

    Wage = MIN(w.Wage),

    Sector = MIN(w.Sector),

    tw = MIN(a.tw)

    FROM #WageData w

    INNER JOIN (

    SELECT

    ID,

    mw = MAX(wage),

    tw = SUM(wage)

    FROM #WageData wi

    GROUP BY ID

    ) a

    ON a.ID = w.ID AND a.mw = w.Wage

    GROUP BY w.ID

    How does it perform against the actual data? Can you post the actual execution plan please?

    Cheers

    ChrisM

    This seems to a bit simpler and works with the sample data:

    SELECT ID, Wage=MAX(Wage), Sector=MIN(Sector), tw=SUM(Wage)

    FROM #WageData

    GROUP BY ID

    Did I miss something?

    Yep. MIN(Sector) breaks the spec 😉 We want the sector corresponding to the highest wage. MIN(Sector) is apparently ok to break a tie on MAX(Wage).

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (6/18/2013)


    dwain.c (6/17/2013)


    ChrisM@Work (6/17/2013)


    Hi Amy

    This runs about 4 times faster than the original against the sample data set.

    SELECT

    w.ID,

    Wage = MIN(w.Wage),

    Sector = MIN(w.Sector),

    tw = MIN(a.tw)

    FROM #WageData w

    INNER JOIN (

    SELECT

    ID,

    mw = MAX(wage),

    tw = SUM(wage)

    FROM #WageData wi

    GROUP BY ID

    ) a

    ON a.ID = w.ID AND a.mw = w.Wage

    GROUP BY w.ID

    How does it perform against the actual data? Can you post the actual execution plan please?

    Cheers

    ChrisM

    This seems to a bit simpler and works with the sample data:

    SELECT ID, Wage=MAX(Wage), Sector=MIN(Sector), tw=SUM(Wage)

    FROM #WageData

    GROUP BY ID

    Did I miss something?

    Yep. MIN(Sector) breaks the spec 😉 We want the sector corresponding to the highest wage. MIN(Sector) is apparently ok to break a tie on MAX(Wage).

    I thought it might. Actually I was working on another effort before I hit on and posted the above which, unfortunately I didn't save. I'll try to reproduce it and see if it stands a chance.

    Stay tuned...


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • This was what I was working on before I had the brilliantly silly idea that my previous submission would be functionally correct:

    SELECT ID, Wage, Sector, tw

    FROM (

    SELECT ID, Wage, Sector, tw=SUM(Wage) OVER (PARTITION BY ID)

    ,rn=ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Wage DESC, Sector)

    FROM #WageData) a

    WHERE rn=1

    Probably won't be faster than Chris's because of the sort in the execution plan but just thought I'd throw it out there.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Thank you again for helping me find a solution, Chris. I ran your new query against the older one, and the execution plans using the source data are attached.

    (I hope that's what you meant by sending on the execution plans. )

    But before delving too heavily into which one is faster, I can't use the new query as an indexable view any more than the original. Which is why the cascading CROSS APPLY looked to appealing. I struggle with when to give up and declare what I'm trying to do is impossible, especially since I can't claim to be a SQL expert. But I can keep using the CROSS APPLY method when the sector is not necessary, so all is good. Thanks again.

    Amy

  • How are you consuming the output, Amy? At almost 3M rows I'm guessing it's part of another query, in which case I'd look first at bundling up the faster query as an inline table-valued function, because it's very quick to write and test. If you can provide a couple of examples of usage, we'll get a better idea of how best to help you. There are alternatives.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • There is so much good stuff here. Thanks.

Viewing 9 posts - 31 through 38 (of 38 total)

You must be logged in to reply to this topic. Login to reply