Correlated query to INNER JOIN or Window function

  • I'm having some performance issues with a TSQL query. It's a complex statement but the main issue is the correlated query.

    How can I convert this

    SELECT TOP 5

    (SELECT SUM(lt2.col3)

    FROM dbo.MyTable2 lt2

    WHERElt2.col1 = lt.col1 AND lt2.col2 = lt.col2 AND lt2.id = lt.id ) AS Result

    FROM dbo.MyTable1 t1

    ... to an inner join or a sql2012 window function?

    By the way, I just added the TOP 5 myself while testing. It's not in the main query.

  • What are you trying to do? (Explain the business part, not the math.)

    Can you give a really simple example? Do you have a table you're trying to join the TOP values query to? You do that by using CROSS APPLY.

  • I am trying to eliminate the subselect.

    Anyway, I was able to convert it to an INNER join, but it actually performs slower.

  • As previously said, you can use a cross apply.

    I'm leaving both options. Note that the join needs to calculate all the values before joining and that's why it can be slow.

    SELECT TOP 5

    x.Result

    FROM dbo.MyTable1 t1

    CROSS APPLY (SELECT SUM(lt2.col3) AS Result

    FROM dbo.MyTable2 lt2

    WHERElt2.col1 = lt.col1 AND lt2.col2 = lt.col2 AND lt2.id = lt.id ) x;

    SELECT TOP 5

    x.Result

    FROM dbo.MyTable1 t1

    JOIN (SELECT lt2.col1, lt2.col2, lt2.id,

    SUM(lt2.col3) AS Result

    FROM dbo.MyTable2 lt2

    GROUP BY lt2.col1, lt2.col2, lt2.id ) x ON x.col1 = lt.col1 AND x.col2 = lt.col2 AND x.id = lt.id;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • SELECT TOP 5

    lt1.col1, lt1.col2, lt1.id, SUM(lt2.col3) AS Result

    FROM dbo.MyTable1 lt1

    LEFT JOIN dbo.MyTable2 lt2 ON lt2.col1 = lt1.col1 AND lt2.col2 = lt1.col2 AND lt2.id = lt1.id

    GROUP BY lt1.col1, lt1.col2, lt1.id

    ORDER BY Result DESC

    _____________
    Code for TallyGenerator

  • Luis Cazares (3/31/2015)


    As previously said, you can use a cross apply.

    I'm leaving both options. Note that the join needs to calculate all the values before joining and that's why it can be slow.

    SELECT TOP 5

    x.Result

    FROM dbo.MyTable1 t1

    CROSS APPLY (SELECT SUM(lt2.col3) AS Result

    FROM dbo.MyTable2 lt2

    WHERElt2.col1 = lt.col1 AND lt2.col2 = lt.col2 AND lt2.id = lt.id ) x;

    SELECT TOP 5

    x.Result

    FROM dbo.MyTable1 t1

    JOIN (SELECT lt2.col1, lt2.col2, lt2.id,

    SUM(lt2.col3) AS Result

    FROM dbo.MyTable2 lt2

    GROUP BY lt2.col1, lt2.col2, lt2.id ) x ON x.col1 = lt.col1 AND x.col2 = lt.col2 AND x.id = lt.id;

    True, forgot that, just saw it via SQL Sentry. Thanks.

    Tried as a cross apply too, and it basically performs the same. Now I wonder if , on my particular case, a window function may perform better. But not very good with window functions. I will try though.

    I'm afraid I will have to take a look on the current Indexes, but I was trying to improve the TSQL as a 1st option.

  • sql-lover (3/31/2015)


    Luis Cazares (3/31/2015)


    As previously said, you can use a cross apply.

    I'm leaving both options. Note that the join needs to calculate all the values before joining and that's why it can be slow.

    SELECT TOP 5

    x.Result

    FROM dbo.MyTable1 t1

    CROSS APPLY (SELECT SUM(lt2.col3) AS Result

    FROM dbo.MyTable2 lt2

    WHERElt2.col1 = lt.col1 AND lt2.col2 = lt.col2 AND lt2.id = lt.id ) x;

    SELECT TOP 5

    x.Result

    FROM dbo.MyTable1 t1

    JOIN (SELECT lt2.col1, lt2.col2, lt2.id,

    SUM(lt2.col3) AS Result

    FROM dbo.MyTable2 lt2

    GROUP BY lt2.col1, lt2.col2, lt2.id ) x ON x.col1 = lt.col1 AND x.col2 = lt.col2 AND x.id = lt.id;

    True, forgot that, just saw it via SQL Sentry. Thanks.

    Tried as a cross apply too, and it basically performs the same. Now I wonder if , on my particular case, a window function may perform better. But not very good with window functions. I will try though.

    I'm afraid I will have to take a look on the current Indexes, but I was trying to improve the TSQL as a 1st option.

    Shouldn't you have an ORDER BY on the outer query to control which of the TOP 5 records are returned?


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • There is an ORDER BY , but I did not post it. But the TOP and ORDER BY are there just for testing. They actually do not exist in the original query. I put it there to reduce the original set.

  • sql-lover (3/31/2015)


    I'm having some performance issues with a TSQL query. It's a complex statement but the main issue is the correlated query.

    Post the full query, the table structure (DDL), consumable sample data and the actual execution plan, otherwise there is little one can do here.

    😎

  • Eirikur Eiriksson (3/31/2015)


    sql-lover (3/31/2015)


    I'm having some performance issues with a TSQL query. It's a complex statement but the main issue is the correlated query.

    Post the full query, the table structure (DDL), consumable sample data and the actual execution plan, otherwise there is little one can do here.

    😎

    I can't, due privacy reasons. But I will try to obfuscate the execution plans and upload.

  • Can you at least explain why you are so sure that this subquery is the main performance problem?

    You haven't given us a lot to go on, and any way we write this it still has to get all the rows of B that match your rows in A, and sum them by the foreign key. I would guess that any performance issue is far more likely to be about indexes and/or statistics then have this vs an apply vs a windowing function.

    Also, I really don't think it is the place for a windowing function. Windowing is useful when you want to apply an aggregation to a column value while keeping the overall level of granularity the same. Here you want a table A level of granularity, with an aggregation of table B. If you did a windowed sum of table B, you'd still need to group the data or do a top before joining to table A).

    The only other rewrite to try would be something like a CTE or derived table that groups and sums table B, and then joining to that in the main query (instead of correlating).

  • You can use the free version of SQL Sentry Plan Explorer to obfuscate it.

    😎

  • I fixed it. Even though I wrote the query in 3 different ways the performance did not improve, in fact, it got worse. I always try to fix performance issues looking for better ways to write the same TSQL query. We do have an "over index" problem at work which is really, really bad, and I'm still fixing (happened before I came on board) and I try to stay away of new Indexes if possible.

    Having said that, this particular issue was about a missing Index. In fact, it seems that the all current Indexes need to be changed. The CI is not the best choice.

    As an immediate workaround I may add this new Index which reduced the query time from 20/30 min to 1 sec in our Dev box. That's what I call a performance boost 😉

    I will try to post the query and execution plan later if I can. It was an interesting problem.

  • Eirikur Eiriksson (4/1/2015)


    You can use the free version of SQL Sentry Plan Explorer to obfuscate it.

    😎

    I've been using it for several years now, it does not allow you that, by default. I think there is a powershell script that externally does it ? But I can be wrong. Happy to hear how or see a link with instructions.

    I have the FREE edition by the way.

  • sql-lover (4/2/2015)


    Eirikur Eiriksson (4/1/2015)


    You can use the free version of SQL Sentry Plan Explorer to obfuscate it.

    😎

    I've been using it for several years now, it does not allow you that, by default. I think there is a powershell script that externally does it ? But I can be wrong. Happy to hear how or see a link with instructions.

    I have the FREE edition by the way.

    The free edition allows you to do this, Edit->Anonymize, which version are you on? (latest is 2.7)

    😎

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply