Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


CTE - UNION ALL - GROUP BY


CTE - UNION ALL - GROUP BY

Author
Message
Arjun Sivadasan
Arjun Sivadasan
SSC-Addicted
SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)

Group: General Forum Members
Points: 448 Visits: 976
Today I got a mail from one of our developers. It is a little complicated to explain and I cannot post the DDL. So, let me post a piece of the code ('cleaned up') and explain what was going wrong.

CTE definition 'cleaned up'

;with t2 as 
(
select
tb.product,
r.category,
p.date,
sum(p.value) as value,
ROW_NUMBER() over (partition by tb.product, r.category order by SUM(p.value)) as cnt
FROM
--tables
)



select t3.product, t3.category, SUM(t3.value) * -1  as Value 
from
(
select t2.product, t2.category, t2.Value * @Ratio1 as value from t2 where cnt = @Count1
union all
select t2.product, t2.category, t2.Value * @Ratio2 as value from t2 where cnt = @Count2
) as t3
--where t3.category = 'ABC'
group by t3.product, t3.category
order by t3.product



Here t2 is a CTE which does an aggregation (SUM) and a row_number() generation. When the developer ran this query, he was getting incorrect output for column Value. When he added the filter by category, he got the correct output.

My analysis:
I removed the GROUP BY and ran the query, got the correct values for the column value (say Val1 and Val2).
When I added the GROUP BY, the output was wrong, i.e. SUM(t3.value) was not showing
Val1 + Val2.

I got a hunch that this had something to do with UNION ALL on CTE, so I modified the query as below to eliminate the UNION ALL op.

select t3.product, t3.category, SUM(t3.value) * -1  as Value 
from
(
select t2.product, t2.category,
CASE
WHEN cnt = @Count1 THEN t2.value * @Ratio1
WHEN cnt = @Count2 THEN t2.value * @Ratio2
END as value
from t2 where cnt = @Count1 OR cnt = @Count2
) as t3
--where t3.category = 'ABC'
group by t3.product, t3.category
order by t3.product



This worked just fine, gave the expected output. What I would like to know is what could have caused a wrong aggregation when the query had a UNION ALL. I know the CTE gets evaluated twice but the CTE is deterministic. Any help here would be really great. I will post the CTE definition below.
Arjun Sivadasan
Arjun Sivadasan
SSC-Addicted
SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)

Group: General Forum Members
Points: 448 Visits: 976
--removed duplicate post
ChrisM@Work
ChrisM@Work
SSCrazy Eights
SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)SSCrazy Eights (9K reputation)

Group: General Forum Members
Points: 8970 Visits: 19021
This is an interesting one Arjun! Can you post the actual plan please?

“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Arjun Sivadasan
Arjun Sivadasan
SSC-Addicted
SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)

Group: General Forum Members
Points: 448 Visits: 976
Yes, very interesting. I will need more time to prepare the sample tables. I am not supposed to post the original plan, can get fired for that.
Arjun Sivadasan
Arjun Sivadasan
SSC-Addicted
SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)

Group: General Forum Members
Points: 448 Visits: 976
I tried to reproduce the problem using the following query, which does pretty much the same thing (warning: the query will not make any logical sense). However, I could not reproduce it. I'm still able to reproduce the problem on our development server, using the original query.

An interesting observation is the original query worked fine when I specified MAXDOP as 1 in the final select.

;WITH cte
AS
(
SELECT
o.name [tablename],
SUM(ac.max_length) [ColLength],
ROW_NUMBER() OVER (PARTITION BY o.name ORDER BY SUM(ac.max_length)) AS cnt

FROM
sys.objects o
INNER JOIN sys.COLUMNS c
ON c.OBJECT_ID = o.object_id
INNER JOIN sys.all_columns ac
ON ac.NAME = c.NAME AND ac.OBJECT_ID = c.OBJECT_ID

GROUP BY
o.name,
ac.max_length
)

SELECT
tablename, SUM(ColLength) * -1 [ColLength]
FROM
(
SELECT tablename, Collength * 2 [ColLength] FROM cte WHERE cnt = 1
UNION ALL
SELECT tablename, ColLength * 4 [ColLength] FROM cte WHERE cnt = 2
) t
GROUP BY tablename


Arjun Sivadasan
Arjun Sivadasan
SSC-Addicted
SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)

Group: General Forum Members
Points: 448 Visits: 976
Update on this:

I and my teammate have concluded that this a SQL bug - not easily reproducible. The initial query, however, always produces wrong result. We ran the same query with MAXDOP(1) and higher values. When MAXDOP(1) was specified, we got correct results. However, for each other number specified, we got a different value.

There is a similar problem which was reported by a use to MS, however, in that case, the CTE definition was non-deterministic as the ORDER BY in his ROW_NUMBER() was on duplicate values. We do plan to report this to MS, once we find time to produce sample data. Cheers!
Lynn Pettis
Lynn Pettis
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24200 Visits: 37964
Are the tables partitioned in any way?

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Arjun Sivadasan
Arjun Sivadasan
SSC-Addicted
SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)

Group: General Forum Members
Points: 448 Visits: 976
I think yes; I'm quite new to this project, so, I will have to check before I confirm. I will update tomorrow.
Arjun Sivadasan
Arjun Sivadasan
SSC-Addicted
SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)SSC-Addicted (448 reputation)

Group: General Forum Members
Points: 448 Visits: 976
I checked this up and yes, the tables which are used in the CTE definition (very first query in my first post) are partitioned. The partition is on a column BatchID. This column appears in the ON clause of a JOIN in the CTE.
Lynn Pettis
Lynn Pettis
SSC-Insane
SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)SSC-Insane (24K reputation)

Group: General Forum Members
Points: 24200 Visits: 37964
Can you reproduce the error by setting up tables that look similar in structure and data type and recreate the issue?

Cool
Lynn Pettis

For better assistance in answering your questions, click here
For tips to get better help with Performance Problems, click here
For Running Totals and its variations, click here or when working with partitioned tables
For more about Tally Tables, click here
For more about Cross Tabs and Pivots, click here and here
Managing Transaction Logs

SQL Musings from the Desert Fountain Valley SQL (My Mirror Blog)
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search