Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

CTE - UNION ALL - GROUP BY Expand / Collapse
Author
Message
Posted Thursday, April 11, 2013 11:31 PM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, September 25, 2014 12:24 AM
Points: 406, Visits: 775
Today I got a mail from one of our developers. It is a little complicated to explain and I cannot post the DDL. So, let me post a piece of the code ('cleaned up') and explain what was going wrong.

CTE definition 'cleaned up'

;with t2 as 
(
select
tb.product,
r.category,
p.date,
sum(p.value) as value,
ROW_NUMBER() over (partition by tb.product, r.category order by SUM(p.value)) as cnt
FROM
--tables
)

select t3.product, t3.category, SUM(t3.value) * -1  as Value 
from
(
select t2.product, t2.category, t2.Value * @Ratio1 as value from t2 where cnt = @Count1
union all
select t2.product, t2.category, t2.Value * @Ratio2 as value from t2 where cnt = @Count2
) as t3
--where t3.category = 'ABC'
group by t3.product, t3.category
order by t3.product

Here t2 is a CTE which does an aggregation (SUM) and a row_number() generation. When the developer ran this query, he was getting incorrect output for column Value. When he added the filter by category, he got the correct output.

My analysis:
I removed the GROUP BY and ran the query, got the correct values for the column value (say Val1 and Val2).
When I added the GROUP BY, the output was wrong, i.e. SUM(t3.value) was not showing
Val1 + Val2.

I got a hunch that this had something to do with UNION ALL on CTE, so I modified the query as below to eliminate the UNION ALL op.

select t3.product, t3.category, SUM(t3.value) * -1  as Value 
from
(
select t2.product, t2.category,
CASE
WHEN cnt = @Count1 THEN t2.value * @Ratio1
WHEN cnt = @Count2 THEN t2.value * @Ratio2
END as value
from t2 where cnt = @Count1 OR cnt = @Count2
) as t3
--where t3.category = 'ABC'
group by t3.product, t3.category
order by t3.product

This worked just fine, gave the expected output. What I would like to know is what could have caused a wrong aggregation when the query had a UNION ALL. I know the CTE gets evaluated twice but the CTE is deterministic. Any help here would be really great. I will post the CTE definition below.
Post #1441578
Posted Thursday, April 11, 2013 11:33 PM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, September 25, 2014 12:24 AM
Points: 406, Visits: 775
--removed duplicate post
Post #1441580
Posted Friday, April 12, 2013 1:27 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 8:29 AM
Points: 6,781, Visits: 13,987
This is an interesting one Arjun! Can you post the actual plan please?

“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1441591
Posted Friday, April 12, 2013 1:39 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, September 25, 2014 12:24 AM
Points: 406, Visits: 775
Yes, very interesting. I will need more time to prepare the sample tables. I am not supposed to post the original plan, can get fired for that.
Post #1441594
Posted Sunday, April 14, 2013 9:53 PM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, September 25, 2014 12:24 AM
Points: 406, Visits: 775
I tried to reproduce the problem using the following query, which does pretty much the same thing (warning: the query will not make any logical sense). However, I could not reproduce it. I'm still able to reproduce the problem on our development server, using the original query.

An interesting observation is the original query worked fine when I specified MAXDOP as 1 in the final select.

;WITH cte
AS
(
SELECT
o.name [tablename],
SUM(ac.max_length) [ColLength],
ROW_NUMBER() OVER (PARTITION BY o.name ORDER BY SUM(ac.max_length)) AS cnt

FROM
sys.objects o
INNER JOIN sys.COLUMNS c
ON c.OBJECT_ID = o.object_id
INNER JOIN sys.all_columns ac
ON ac.NAME = c.NAME AND ac.OBJECT_ID = c.OBJECT_ID

GROUP BY
o.name,
ac.max_length
)

SELECT
tablename, SUM(ColLength) * -1 [ColLength]
FROM
(
SELECT tablename, Collength * 2 [ColLength] FROM cte WHERE cnt = 1
UNION ALL
SELECT tablename, ColLength * 4 [ColLength] FROM cte WHERE cnt = 2
) t
GROUP BY tablename

Post #1442160
Posted Wednesday, April 17, 2013 3:53 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, September 25, 2014 12:24 AM
Points: 406, Visits: 775
Update on this:

I and my teammate have concluded that this a SQL bug - not easily reproducible. The initial query, however, always produces wrong result. We ran the same query with MAXDOP(1) and higher values. When MAXDOP(1) was specified, we got correct results. However, for each other number specified, we got a different value.

There is a similar problem which was reported by a use to MS, however, in that case, the CTE definition was non-deterministic as the ORDER BY in his ROW_NUMBER() was on duplicate values. We do plan to report this to MS, once we find time to produce sample data. Cheers!
Post #1443167
Posted Wednesday, April 17, 2013 6:19 AM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Yesterday @ 6:26 PM
Points: 20,732, Visits: 32,496
Are the tables partitioned in any way?



Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1443217
Posted Wednesday, April 17, 2013 6:22 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, September 25, 2014 12:24 AM
Points: 406, Visits: 775
I think yes; I'm quite new to this project, so, I will have to check before I confirm. I will update tomorrow.
Post #1443218
Posted Thursday, April 18, 2013 3:28 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Thursday, September 25, 2014 12:24 AM
Points: 406, Visits: 775
I checked this up and yes, the tables which are used in the CTE definition (very first query in my first post) are partitioned. The partition is on a column BatchID. This column appears in the ON clause of a JOIN in the CTE.
Post #1443700
Posted Thursday, April 18, 2013 6:26 AM


SSC-Insane

SSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-InsaneSSC-Insane

Group: General Forum Members
Last Login: Yesterday @ 6:26 PM
Points: 20,732, Visits: 32,496
Can you reproduce the error by setting up tables that look similar in structure and data type and recreate the issue?


Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Post #1443790
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse