CTE - UNION ALL - GROUP BY

  • Today I got a mail from one of our developers. It is a little complicated to explain and I cannot post the DDL. So, let me post a piece of the code ('cleaned up') and explain what was going wrong.

    CTE definition 'cleaned up'

    ;with t2 as

    (

    select

    tb.product,

    r.category,

    p.date,

    sum(p.value) as value,

    ROW_NUMBER() over (partition by tb.product, r.category order by SUM(p.value)) as cnt

    FROM

    --tables

    )

    select t3.product, t3.category, SUM(t3.value) * -1 as Value

    from

    (

    select t2.product, t2.category, t2.Value * @Ratio1 as value from t2 where cnt = @Count1

    union all

    select t2.product, t2.category, t2.Value * @Ratio2 as value from t2 where cnt = @Count2

    ) as t3

    --where t3.category = 'ABC'

    group by t3.product, t3.category

    order by t3.product

    Here t2 is a CTE which does an aggregation (SUM) and a row_number() generation. When the developer ran this query, he was getting incorrect output for column Value. When he added the filter by category, he got the correct output.

    My analysis:

    I removed the GROUP BY and ran the query, got the correct values for the column value (say Val1 and Val2).

    When I added the GROUP BY, the output was wrong, i.e. SUM(t3.value) was not showing

    Val1 + Val2.

    I got a hunch that this had something to do with UNION ALL on CTE, so I modified the query as below to eliminate the UNION ALL op.

    select t3.product, t3.category, SUM(t3.value) * -1 as Value

    from

    (

    select t2.product, t2.category,

    CASE

    WHEN cnt = @Count1 THEN t2.value * @Ratio1

    WHEN cnt = @Count2 THEN t2.value * @Ratio2

    ENDas value

    from t2 where cnt = @Count1 OR cnt = @Count2

    ) as t3

    --where t3.category = 'ABC'

    group by t3.product, t3.category

    order by t3.product

    This worked just fine, gave the expected output. What I would like to know is what could have caused a wrong aggregation when the query had a UNION ALL. I know the CTE gets evaluated twice but the CTE is deterministic. Any help here would be really great. I will post the CTE definition below.

    https://sqlroadie.com/

  • --removed duplicate post

    https://sqlroadie.com/

  • This is an interesting one Arjun! Can you post the actual plan please?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Yes, very interesting. I will need more time to prepare the sample tables. I am not supposed to post the original plan, can get fired for that.

    https://sqlroadie.com/

  • I tried to reproduce the problem using the following query, which does pretty much the same thing (warning: the query will not make any logical sense). However, I could not reproduce it. I'm still able to reproduce the problem on our development server, using the original query.

    An interesting observation is the original query worked fine when I specified MAXDOP as 1 in the final select.

    ;WITH cte

    AS

    (

    SELECT

    o.name [tablename],

    SUM(ac.max_length) [ColLength],

    ROW_NUMBER() OVER (PARTITION BY o.name ORDER BY SUM(ac.max_length)) AS cnt

    FROM

    sys.objects o

    INNER JOIN sys.COLUMNS c

    ON c.OBJECT_ID = o.object_id

    INNER JOIN sys.all_columns ac

    ON ac.NAME = c.NAME AND ac.OBJECT_ID = c.OBJECT_ID

    GROUP BY

    o.name,

    ac.max_length

    )

    SELECT

    tablename, SUM(ColLength) * -1 [ColLength]

    FROM

    (

    SELECT tablename, Collength * 2 [ColLength] FROM cte WHERE cnt = 1

    UNION ALL

    SELECT tablename, ColLength * 4 [ColLength] FROM cte WHERE cnt = 2

    ) t

    GROUP BY tablename

    https://sqlroadie.com/

  • Update on this:

    I and my teammate have concluded that this a SQL bug - not easily reproducible. The initial query, however, always produces wrong result. We ran the same query with MAXDOP(1) and higher values. When MAXDOP(1) was specified, we got correct results. However, for each other number specified, we got a different value.

    There is a similar problem which was reported by a use to MS, however, in that case, the CTE definition was non-deterministic as the ORDER BY in his ROW_NUMBER() was on duplicate values. We do plan to report this to MS, once we find time to produce sample data. Cheers!

    https://sqlroadie.com/

  • Are the tables partitioned in any way?

  • I think yes; I'm quite new to this project, so, I will have to check before I confirm. I will update tomorrow.

    https://sqlroadie.com/

  • I checked this up and yes, the tables which are used in the CTE definition (very first query in my first post) are partitioned. The partition is on a column BatchID. This column appears in the ON clause of a JOIN in the CTE.

    https://sqlroadie.com/

  • Can you reproduce the error by setting up tables that look similar in structure and data type and recreate the issue?

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply