• Can you at least explain why you are so sure that this subquery is the main performance problem?

    You haven't given us a lot to go on, and however we write this, it still has to fetch all the rows of B that match your rows in A and sum them by the foreign key. I would guess that any performance issue is far more likely to come from indexes and/or statistics than from the choice between this subquery, an APPLY, or a windowing function.

    Also, I really don't think this is the place for a windowing function. Windowing is useful when you want to apply an aggregation to a column value while keeping the overall level of granularity the same. Here you want table A's level of granularity, with an aggregation of table B. If you did a windowed sum over table B, you'd still get one row per row of B, so you'd still need to group the data or do a TOP before joining to table A.
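
    To make the granularity point concrete, here is a minimal sketch. It assumes hypothetical tables `a` and `b` with made-up columns (`a_id`, `amount`), and it runs on SQLite through Python's `sqlite3` module rather than on your actual engine, so treat it as an illustration only.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (
        id     INTEGER PRIMARY KEY,
        a_id   INTEGER REFERENCES a(id),
        amount INTEGER
    );
    INSERT INTO a (id) VALUES (1), (2);
    INSERT INTO b (a_id, amount) VALUES (1, 10), (1, 20), (2, 5);
""")

# Windowed sum: the total is repeated on every row of b,
# so the result stays at table B's granularity.
windowed = conn.execute("""
    SELECT a_id, amount, SUM(amount) OVER (PARTITION BY a_id) AS total
    FROM b
    ORDER BY a_id, amount
""").fetchall()

# Grouped sum: one row per a_id, which is the granularity table A needs.
grouped = conn.execute("""
    SELECT a_id, SUM(amount) AS total
    FROM b
    GROUP BY a_id
    ORDER BY a_id
""").fetchall()

print(len(windowed), "rows from the windowed sum")
print(len(grouped), "rows from the grouped sum")
```

    The windowed query returns as many rows as B has, so it would need a further GROUP BY, DISTINCT or TOP step before it could be joined one-to-one with A.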

    The only other rewrite to try would be something like a CTE or derived table that groups and sums table B, which you then join to in the main query (instead of correlating).
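
    A rough sketch of that rewrite, assuming your original query is a correlated subquery that sums B per row of A. The table and column names (`a`, `b`, `a_id`, `amount`) are hypothetical, and the example uses SQLite via Python's `sqlite3` module, so it only shows that the two forms return the same rows; it says nothing about how your engine would plan them.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (
        id     INTEGER PRIMARY KEY,
        a_id   INTEGER REFERENCES a(id),
        amount INTEGER
    );
    INSERT INTO a (id) VALUES (1), (2), (3);
    INSERT INTO b (a_id, amount) VALUES (1, 10), (1, 20), (2, 5);
""")

# Assumed shape of the original: a correlated subquery evaluated per row of a.
correlated_sql = """
    SELECT a.id,
           (SELECT SUM(b.amount) FROM b WHERE b.a_id = a.id) AS total
    FROM a
    ORDER BY a.id
"""

# Rewrite: group and sum b once in a derived table, then join to it.
# LEFT JOIN keeps rows of a that have no match in b (total is NULL),
# which mirrors what the correlated subquery returns for them.
derived_sql = """
    SELECT a.id, t.total
    FROM a
    LEFT JOIN (
        SELECT a_id, SUM(amount) AS total
        FROM b
        GROUP BY a_id
    ) AS t ON t.a_id = a.id
    ORDER BY a.id
"""

correlated_rows = conn.execute(correlated_sql).fetchall()
derived_rows = conn.execute(derived_sql).fetchall()
print(correlated_rows == derived_rows)
```

    Whether the join form is any faster depends on the optimizer, the indexes on the foreign key and the statistics, so compare the actual execution plans before committing to either version.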