SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Stairway to T-SQL Part 2: Beyond T-SQL Basics: Level 3: Building a Correlated Subquery


Stairway to T-SQL Part 2: Beyond T-SQL Basics: Level 3: Building a Correlated Subquery

Author
Message
Greg Larsen
Greg Larsen
SSCertifiable
SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)SSCertifiable (6.8K reputation)

Group: General Forum Members
Points: 6847 Visits: 290
Comments posted to this topic are about the item Stairway to T-SQL Part 2: Beyond T-SQL Basics: Level 3: Building a Correlated Subquery

Gregory A. Larsen, MVP
Eirikur Eiriksson
Eirikur Eiriksson
SSC Guru
SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)

Group: General Forum Members
Points: 63368 Visits: 19819
Thank you Gregory for this clear and concise article.

I cannot resist pitching in some Window function version of the queries;


;WITH CUST_SALES_DETAILS AS
(
SELECT
SSOH.CustomerID
,ROW_NUMBER() OVER
(PARTITION BY SSOH.SalesOrderID ORDER BY (SELECT NULL)) AS SSOH_RID
,COUNT(SSOD.SalesOrderDetailID) OVER
(PARTITION BY SSOH.SalesOrderID) AS COUNT_SSOD
FROM Sales.SalesOrderHeader SSOH
INNER JOIN Sales.SalesOrderDetail SSOD
ON SSOH.SalesOrderID = SSOD.SalesOrderID
)
SELECT
CSD.CustomerID
,CSD.COUNT_SSOD
/* ,DENSE_RANK() OVER (ORDER BY CSD.COUNT_SSOD DESC) AS SALES_RANK */
FROM CUST_SALES_DETAILS CSD
WHERE CSD.SSOH_RID = 1
AND CSD.COUNT_SSOD > 70;

GO
-------------------------------------------

;WITH CUST_SALE_BY_YEAR AS
(
SELECT
SSOH.CustomerID
,ROW_NUMBER() OVER
(PARTITION BY SSOH.CustomerID,YEAR(SSOH.[OrderDate])
ORDER BY (SELECT NULL)) AS CUST_RID
,YEAR(SSOH.[OrderDate]) AS SSOH_YEAR
,SUM(SSOH.SubTotal) OVER
(PARTITION BY SSOH.CustomerID,YEAR(SSOH.[OrderDate])) AS SSOH_SUBTOTAL
,(SUM(SSOH.SubTotal) OVER
(PARTITION BY SSOH.CustomerID,YEAR(SSOH.[OrderDate]))) * (0.10) AS SSOH_REBATE
FROM Sales.SalesOrderHeader SSOH
)
SELECT
CABY.CustomerID
/*
,DENSE_RANK() OVER (PARTITION BY CABY.SSOH_YEAR ORDER BY CABY.SSOH_SUBTOTAL DESC,CABY.SSOH_YEAR DESC) AS SALES_RANK
*/
,CABY.SSOH_YEAR
,CABY.SSOH_SUBTOTAL
,CABY.SSOH_REBATE
FROM CUST_SALE_BY_YEAR CABY
WHERE CABY.CUST_RID = 1
AND CABY.SSOH_YEAR = 2008
AND CABY.SSOH_SUBTOTAL > 150000;

GO
-------------------------------------------


NerdMan
NerdMan
SSC-Enthusiastic
SSC-Enthusiastic (125 reputation)SSC-Enthusiastic (125 reputation)SSC-Enthusiastic (125 reputation)SSC-Enthusiastic (125 reputation)SSC-Enthusiastic (125 reputation)SSC-Enthusiastic (125 reputation)SSC-Enthusiastic (125 reputation)SSC-Enthusiastic (125 reputation)

Group: General Forum Members
Points: 125 Visits: 153
Thnks for the great article. I have just one remark :
For Question 1 the answer is C and not A. as the answer A refers to inner query instead of outer query.
Dennis Wagner-347763
Dennis Wagner-347763
SSC-Addicted
SSC-Addicted (404 reputation)SSC-Addicted (404 reputation)SSC-Addicted (404 reputation)SSC-Addicted (404 reputation)SSC-Addicted (404 reputation)SSC-Addicted (404 reputation)SSC-Addicted (404 reputation)SSC-Addicted (404 reputation)

Group: General Forum Members
Points: 404 Visits: 229
The correlated subquery may be executed many times. It will be run once for each candidate row selected in the outer query.


I believe that Jeff Moden refers to this as RBAR (row by agonizing row).

A nice followup article would be "How to avoid using the Correlated Subquery."
Eirikur Eiriksson
Eirikur Eiriksson
SSC Guru
SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)SSC Guru (63K reputation)

Group: General Forum Members
Points: 63368 Visits: 19819
Just to point out a possible improvement on the first query in part 3.
The sql engine has to match all rows from the subquery to the outer query. This is because the filtering is outside the subquery.

SELECT CustomerID FROM Sales.SalesOrderHeader OH
WHERE (SELECT COUNT(*) FROM Sales.SalesOrderDetail
WHERE SalesOrderID = OH.SalesOrderID) > 70;



Just by adding a GROUP BY clause, the work is cut by a 1/4, here are two suggestions, the latter slightly faster Cool;

SELECT CustomerID FROM Sales.SalesOrderHeader OH
WHERE (SELECT COUNT(*) FROM Sales.SalesOrderDetail
WHERE SalesOrderID = OH.SalesOrderID
GROUP BY SalesOrderID) > 70;



And

SELECT
SOH.CustomerID
FROM
(
SELECT
COUNT(*) AS SOD_COUNT
,SOD.SalesOrderID
FROM Sales.SalesOrderDetail SOD
GROUP BY SOD.SalesOrderID
) AS X
INNER JOIN Sales.SalesOrderHeader SOH
ON X.SalesOrderID = SOH.SalesOrderID
WHERE X.SOD_COUNT > 70;


GilaMonster
GilaMonster
SSC Guru
SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)

Group: General Forum Members
Points: 370215 Visits: 46952
Dennis Wagner-347763 (3/6/2014)
The correlated subquery may be executed many times. It will be run once for each candidate row selected in the outer query.


I believe that Jeff Moden refers to this as RBAR (row by agonizing row).


Except that correlated subqueries *don't* run once per row of the outer query. It's an old and persistent myth that they do that.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


jetboy2k
jetboy2k
SSC-Enthusiastic
SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)

Group: General Forum Members
Points: 194 Visits: 69
I understand the logic behind listing #3, but I was surprised to see, and uncertain on how it works, that it is not necessary to use another GROUP BY clause in the subquery:

SELECT Outer_H.[CustomerID]
, SUM(Outer_H.[SubTotal]) AS TotalPurchase
, SUM(Outer_H.[SubTotal]) * .10 AS Rebate
FROM [Sales].[SalesOrderHeader] AS Outer_H
WHERE YEAR(Outer_H.[OrderDate]) = '2008'
GROUP BY Outer_H.[CustomerID]
HAVING (SELECT SUM(Inner_H.[SubTotal]) FROM [Sales].[SalesOrderHeader] AS Inner_H
WHERE Inner_H.[CustomerID] = Outer_H.[CustomerID]
AND YEAR(Inner_H.[OrderDate]) = '2008' GROUP BY CLAUSE NOT REQUIRED HERE) > 150000
ORDER BY Rebate DESC;

How is SQL able to execute the aggregate SUM() function in the inner query w/out a GROUP BY clause applied? Does it automatically apply the GROUP BY clause from the outer query to the inner query?

Thanks.
GilaMonster
GilaMonster
SSC Guru
SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)SSC Guru (370K reputation)

Group: General Forum Members
Points: 370215 Visits: 46952
Because there's no un-aggregated columns being returned from that subquery. Group By is only needed when you have columns in aggregate and columns that aren't in aggregate in the select clause.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


jetboy2k
jetboy2k
SSC-Enthusiastic
SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)SSC-Enthusiastic (194 reputation)

Group: General Forum Members
Points: 194 Visits: 69
GilaMonster (9/6/2016)
Because there's no un-aggregated columns being returned from that subquery. Group By is only needed when you have columns in aggregate and columns that aren't in aggregate in the select clause.



Gila, thank you very much. If I learn nothing else from this stairway, I've learned this. So thanks again.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search