The Cascading (CROSS) APPLY

Question

Post reply

The Cascading (CROSS) APPLY

Viewing 9 posts - 31 through 39 (of 39 total)

You must be logged in to reply to this topic. Login to reply

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 1

Hi Amy

This runs about 4 times faster than the original against the sample data set.

SELECT

w.ID,

Wage = MIN(w.Wage),

Sector = MIN(w.Sector),

tw = MIN(a.tw)

FROM #WageData w

INNER JOIN (

SELECT

ID,

mw = MAX(wage),

tw = SUM(wage)

FROM #WageData wi

GROUP BY ID

) a

ON a.ID = w.ID AND a.mw = w.Wage

GROUP BY w.ID

How does it perform against the actual data? Can you post the actual execution plan please?

Cheers

ChrisM

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Dwain Camps SSC Guru Points: 86978 More actions · Answer 2

ChrisM@Work (6/17/2013)
Hi Amy
This runs about 4 times faster than the original against the sample data set.
SELECT
w.ID,
Wage = MIN(w.Wage),
Sector = MIN(w.Sector),
tw = MIN(a.tw)
FROM #WageData w
INNER JOIN (
SELECT
ID,
mw = MAX(wage),
tw = SUM(wage)
FROM #WageData wi
GROUP BY ID
) a
ON a.ID = w.ID AND a.mw = w.Wage
GROUP BY w.ID
How does it perform against the actual data? Can you post the actual execution plan please?
Cheers
ChrisM

This seems to a bit simpler and works with the sample data:

SELECT ID, Wage=MAX(Wage), Sector=MIN(Sector), tw=SUM(Wage)

FROM #WageData

GROUP BY ID

Did I miss something?

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 3

dwain.c (6/17/2013)

...
This seems to a bit simpler and works with the sample data:
SELECT ID, Wage=MAX(Wage), Sector=MIN(Sector), tw=SUM(Wage)
FROM #WageData
GROUP BY ID
Did I miss something?

Well I'll be darned...nice one, Dwain. Must have got sidetracked.

Interestingly, it's only a few percent faster than my overcomplex effort. Instead of reading the table twice, which you would expect looking at the two table references, SQL Server replays a table spool.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 4

dwain.c (6/17/2013)
ChrisM@Work (6/17/2013)
Hi Amy
This runs about 4 times faster than the original against the sample data set.
SELECT
w.ID,
Wage = MIN(w.Wage),
Sector = MIN(w.Sector),
tw = MIN(a.tw)
FROM #WageData w
INNER JOIN (
SELECT
ID,
mw = MAX(wage),
tw = SUM(wage)
FROM #WageData wi
GROUP BY ID
) a
ON a.ID = w.ID AND a.mw = w.Wage
GROUP BY w.ID
How does it perform against the actual data? Can you post the actual execution plan please?
Cheers
ChrisM
This seems to a bit simpler and works with the sample data:
SELECT ID, Wage=MAX(Wage), Sector=MIN(Sector), tw=SUM(Wage)
FROM #WageData
GROUP BY ID
Did I miss something?

Yep. MIN(Sector) breaks the spec 😉 We want the sector corresponding to the highest wage. MIN(Sector) is apparently ok to break a tie on MAX(Wage).

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Dwain Camps SSC Guru Points: 86978 More actions · Answer 5

ChrisM@Work (6/18/2013)
dwain.c (6/17/2013)
ChrisM@Work (6/17/2013)
Hi Amy
This runs about 4 times faster than the original against the sample data set.
SELECT
w.ID,
Wage = MIN(w.Wage),
Sector = MIN(w.Sector),
tw = MIN(a.tw)
FROM #WageData w
INNER JOIN (
SELECT
ID,
mw = MAX(wage),
tw = SUM(wage)
FROM #WageData wi
GROUP BY ID
) a
ON a.ID = w.ID AND a.mw = w.Wage
GROUP BY w.ID
How does it perform against the actual data? Can you post the actual execution plan please?
Cheers
ChrisM
This seems to a bit simpler and works with the sample data:
SELECT ID, Wage=MAX(Wage), Sector=MIN(Sector), tw=SUM(Wage)
FROM #WageData
GROUP BY ID
Did I miss something?
Yep. MIN(Sector) breaks the spec 😉 We want the sector corresponding to the highest wage. MIN(Sector) is apparently ok to break a tie on MAX(Wage).

I thought it might. Actually I was working on another effort before I hit on and posted the above which, unfortunately I didn't save. I'll try to reproduce it and see if it stands a chance.

Stay tuned...

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Dwain Camps SSC Guru Points: 86978 More actions · Answer 6

This was what I was working on before I had the brilliantly silly idea that my previous submission would be functionally correct:

SELECT ID, Wage, Sector, tw

FROM (

SELECT ID, Wage, Sector, tw=SUM(Wage) OVER (PARTITION BY ID)

,rn=ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Wage DESC, Sector)

FROM #WageData) a

WHERE rn=1

Probably won't be faster than Chris's because of the sort in the execution plan but just thought I'd throw it out there.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Amy.G SSCrazy Points: 2305 More actions · Answer 7

Thank you again for helping me find a solution, Chris. I ran your new query against the older one, and the execution plans using the source data are attached.

(I hope that's what you meant by sending on the execution plans. )

But before delving too heavily into which one is faster, I can't use the new query as an indexable view any more than the original. Which is why the cascading CROSS APPLY looked to appealing. I struggle with when to give up and declare what I'm trying to do is impossible, especially since I can't claim to be a SQL expert. But I can keep using the CROSS APPLY method when the sector is not necessary, so all is good. Thanks again.

Amy

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

How are you consuming the output, Amy? At almost 3M rows I'm guessing it's part of another query, in which case I'd look first at bundling up the faster query as an inline table-valued function, because it's very quick to write and test. If you can provide a couple of examples of usage, we'll get a better idea of how best to help you. There are alternatives.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

akljfhnlaflkj SSC Guru Points: 76202 More actions · Answer 9

akljfhnlaflkj

SSC Guru

Points: 76202

May 27, 2015 at 8:30 am

#1800956

There is so much good stuff here. Thanks.