Correlated query to INNER JOIN or Window function

Question

Correlated query to INNER JOIN or Window function

sql-lover

SSCoach

Points: 18530
More actions
March 31, 2015 at 2:18 pm

#296526

I'm having some performance issues with a TSQL query. It's a complex statement but the main issue is the correlated query.
How can I convert this
SELECT TOP 5
(SELECT SUM(lt2.col3)
FROM dbo.MyTable2 lt2
WHERElt2.col1 = lt.col1 AND lt2.col2 = lt.col2 AND lt2.id = lt.id ) AS Result
FROM dbo.MyTable1 t1
... to an inner join or a sql2012 window function?
By the way, I just added the TOP 5 myself while testing. It's not in the main query.

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply

pietlinden SSC Guru Points: 63428 More actions · Answer 1

What are you trying to do? (Explain the business part, not the math.)

Can you give a really simple example? Do you have a table you're trying to join the TOP values query to? You do that by using CROSS APPLY.

sql-lover SSCoach Points: 18530 More actions · Answer 2

I am trying to eliminate the subselect.

Anyway, I was able to convert it to an INNER join, but it actually performs slower.

Luis Cazares SSC Guru Points: 183698 More actions · Answer 3

As previously said, you can use a cross apply.

I'm leaving both options. Note that the join needs to calculate all the values before joining and that's why it can be slow.

SELECT TOP 5

x.Result

FROM dbo.MyTable1 t1

CROSS APPLY (SELECT SUM(lt2.col3) AS Result

FROM dbo.MyTable2 lt2

WHERElt2.col1 = lt.col1 AND lt2.col2 = lt.col2 AND lt2.id = lt.id ) x;

SELECT TOP 5

x.Result

FROM dbo.MyTable1 t1

JOIN (SELECT lt2.col1, lt2.col2, lt2.id,

SUM(lt2.col3) AS Result

FROM dbo.MyTable2 lt2

GROUP BY lt2.col1, lt2.col2, lt2.id ) x ON x.col1 = lt.col1 AND x.col2 = lt.col2 AND x.id = lt.id;

Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

How to post data/code on a forum to get the best help: Option 1 / Option 2

Sergiy SSC Guru Points: 110206 More actions · Answer 4

SELECT TOP 5

lt1.col1, lt1.col2, lt1.id, SUM(lt2.col3) AS Result

FROM dbo.MyTable1 lt1

LEFT JOIN dbo.MyTable2 lt2 ON lt2.col1 = lt1.col1 AND lt2.col2 = lt1.col2 AND lt2.id = lt1.id

GROUP BY lt1.col1, lt1.col2, lt1.id

ORDER BY Result DESC

_____________
Code for TallyGenerator

sql-lover SSCoach Points: 18530 More actions · Answer 5

Luis Cazares (3/31/2015)
As previously said, you can use a cross apply.
I'm leaving both options. Note that the join needs to calculate all the values before joining and that's why it can be slow.
SELECT TOP 5
x.Result
FROM dbo.MyTable1 t1
CROSS APPLY (SELECT SUM(lt2.col3) AS Result
FROM dbo.MyTable2 lt2
WHERElt2.col1 = lt.col1 AND lt2.col2 = lt.col2 AND lt2.id = lt.id ) x;
SELECT TOP 5
x.Result
FROM dbo.MyTable1 t1
JOIN (SELECT lt2.col1, lt2.col2, lt2.id,
SUM(lt2.col3) AS Result
FROM dbo.MyTable2 lt2
GROUP BY lt2.col1, lt2.col2, lt2.id ) x ON x.col1 = lt.col1 AND x.col2 = lt.col2 AND x.id = lt.id;

True, forgot that, just saw it via SQL Sentry. Thanks.

Tried as a cross apply too, and it basically performs the same. Now I wonder if , on my particular case, a window function may perform better. But not very good with window functions. I will try though.

I'm afraid I will have to take a look on the current Indexes, but I was trying to improve the TSQL as a 1st option.

Dwain Camps SSC Guru Points: 86908 More actions · Answer 6

sql-lover (3/31/2015)
Luis Cazares (3/31/2015)
As previously said, you can use a cross apply.
I'm leaving both options. Note that the join needs to calculate all the values before joining and that's why it can be slow.
SELECT TOP 5
x.Result
FROM dbo.MyTable1 t1
CROSS APPLY (SELECT SUM(lt2.col3) AS Result
FROM dbo.MyTable2 lt2
WHERElt2.col1 = lt.col1 AND lt2.col2 = lt.col2 AND lt2.id = lt.id ) x;
SELECT TOP 5
x.Result
FROM dbo.MyTable1 t1
JOIN (SELECT lt2.col1, lt2.col2, lt2.id,
SUM(lt2.col3) AS Result
FROM dbo.MyTable2 lt2
GROUP BY lt2.col1, lt2.col2, lt2.id ) x ON x.col1 = lt.col1 AND x.col2 = lt.col2 AND x.id = lt.id;
True, forgot that, just saw it via SQL Sentry. Thanks.
Tried as a cross apply too, and it basically performs the same. Now I wonder if , on my particular case, a window function may perform better. But not very good with window functions. I will try though.
I'm afraid I will have to take a look on the current Indexes, but I was trying to improve the TSQL as a 1st option.

Shouldn't you have an ORDER BY on the outer query to control which of the TOP 5 records are returned?

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

sql-lover SSCoach Points: 18530 More actions · Answer 7

There is an ORDER BY , but I did not post it. But the TOP and ORDER BY are there just for testing. They actually do not exist in the original query. I put it there to reduce the original set.

Eirikur Eiriksson SSC Guru Points: 182882 More actions · Answer 8

sql-lover (3/31/2015)
I'm having some performance issues with a TSQL query. It's a complex statement but the main issue is the correlated query.

Post the full query, the table structure (DDL), consumable sample data and the actual execution plan, otherwise there is little one can do here.

😎

sql-lover SSCoach Points: 18530 More actions · Answer 9

Eirikur Eiriksson (3/31/2015)
sql-lover (3/31/2015)
I'm having some performance issues with a TSQL query. It's a complex statement but the main issue is the correlated query.
Post the full query, the table structure (DDL), consumable sample data and the actual execution plan, otherwise there is little one can do here.
😎

I can't, due privacy reasons. But I will try to obfuscate the execution plans and upload.

Nevyn SSChampion Points: 14210 More actions · Answer 10

Can you at least explain why you are so sure that this subquery is the main performance problem?

You haven't given us a lot to go on, and any way we write this it still has to get all the rows of B that match your rows in A, and sum them by the foreign key. I would guess that any performance issue is far more likely to be about indexes and/or statistics then have this vs an apply vs a windowing function.

Also, I really don't think it is the place for a windowing function. Windowing is useful when you want to apply an aggregation to a column value while keeping the overall level of granularity the same. Here you want a table A level of granularity, with an aggregation of table B. If you did a windowed sum of table B, you'd still need to group the data or do a top before joining to table A).

The only other rewrite to try would be something like a CTE or derived table that groups and sums table B, and then joining to that in the main query (instead of correlating).

Eirikur Eiriksson SSC Guru Points: 182882 More actions · Answer 11

You can use the free version of SQL Sentry Plan Explorer to obfuscate it.

😎

sql-lover SSCoach Points: 18530 More actions · Answer 12

I fixed it. Even though I wrote the query in 3 different ways the performance did not improve, in fact, it got worse. I always try to fix performance issues looking for better ways to write the same TSQL query. We do have an "over index" problem at work which is really, really bad, and I'm still fixing (happened before I came on board) and I try to stay away of new Indexes if possible.

Having said that, this particular issue was about a missing Index. In fact, it seems that the all current Indexes need to be changed. The CI is not the best choice.

As an immediate workaround I may add this new Index which reduced the query time from 20/30 min to 1 sec in our Dev box. That's what I call a performance boost 😉

I will try to post the query and execution plan later if I can. It was an interesting problem.

sql-lover SSCoach Points: 18530 More actions · Answer 13

Eirikur Eiriksson (4/1/2015)
You can use the free version of SQL Sentry Plan Explorer to obfuscate it.
😎

I've been using it for several years now, it does not allow you that, by default. I think there is a powershell script that externally does it ? But I can be wrong. Happy to hear how or see a link with instructions.

I have the FREE edition by the way.

Eirikur Eiriksson SSC Guru Points: 182882 More actions · Answer 14

sql-lover (4/2/2015)
Eirikur Eiriksson (4/1/2015)
You can use the free version of SQL Sentry Plan Explorer to obfuscate it.
😎
I've been using it for several years now, it does not allow you that, by default. I think there is a powershell script that externally does it ? But I can be wrong. Happy to hear how or see a link with instructions.
I have the FREE edition by the way.

The free edition allows you to do this, Edit->Anonymize, which version are you on? (latest is 2.7)

😎