Row count differences with CTE vs Temp table Vs view

  • Hi All,

    Just for curiosity to see if anyone else has come across an issue like this - SQL 2014 - not sure of the SP/CU at the moment

    rather similar queries
    1 - single query made of 2 CTE's and final join to 2 tables
    2 - insert into a temp table then join with the 2 extra tables
      the insert onto the temp table is the output of the 2 CTE's from point 1
      the join is then from the temp table to the 2 extra tables

    using a view instead of the temp table also results on static row count as expected

    database is static (on my own instance) and no one else using this instance.

    both of these tables contain the same clustered PK on _SQN and _LDTS columns
    DV.S_DatabasesInstanceUsers
    DV.S_DatabasesInstanceUsers_H

    If using the temp table/view the row count is always the same - 74742

    if using only CTE the row count varies depending on the final order by or even on the execution - in some cases without changing anything I execute this twice in a row and the row count differs by 1 or 2 rows
    order by 1 - normally rowcount of 74799 to 74802 but can go as high as 74832
    order by 2 - same
    order by 3, 4 or 5 - 74742

    in the cases where the row count is wrong columns s1_LDTS and s2_LDTS (from the second CTE) are returned as nulls

    instance with maxdop 8 and 4GB ram


    Query 1

    if object_id('tempdb..#t') is not null
    drop table #t

    ;with all_dates
    as (select DatabasesInstanceUsers_SQN
       , DatabasesInstanceUsers_LDTS
      from DV.S_DatabasesInstanceUsers s1
      union
      select DatabasesInstanceUsers_H_SQN
       , DatabasesInstanceUsers_H_LDTS
      from DV.S_DatabasesInstanceUsers_H s2
    ),
    pitt
    as (select ld.DatabasesInstanceUsers_SQN
       , ld.DatabasesInstanceUsers_LDTS
      , max(s1.DatabasesInstanceUsers_LDTS) over (partition by ld.DatabasesInstanceUsers_SQN
                      order by ld.DatabasesInstanceUsers_LDTS
                    ) s1_LDTS
      , max(s2.DatabasesInstanceUsers_H_LDTS) over (partition by ld.DatabasesInstanceUsers_SQN
                      order by ld.DatabasesInstanceUsers_LDTS
                    ) s2_LDTS
      from all_dates ld
      left join DV.S_DatabasesInstanceUsers s1
       on s1.DatabasesInstanceUsers_SQN = ld.DatabasesInstanceUsers_SQN
       and s1.DatabasesInstanceUsers_LDTS = ld.DatabasesInstanceUsers_LDTS
      left join DV.S_DatabasesInstanceUsers_h s2
       on s2.DatabasesInstanceUsers_H_SQN = ld.DatabasesInstanceUsers_SQN
       and s2.DatabasesInstanceUsers_H_LDTS = ld.DatabasesInstanceUsers_LDTS
    )
    select *
    into #t
    from pitt

    select pitt.DatabasesInstanceUsers_SQN
      , pitt.DatabasesInstanceUsers_LDTS
      , pitt.s1_LDTS
      , pitt.s2_LDTS
      , s1.DatabasesInstanceUsers_LDTS
      , s2.DatabasesInstanceUsers_H_LDTS
    from #t pitt

    left outer join DV.S_DatabasesInstanceUsers s1
      on s1.DatabasesInstanceUsers_SQN = pitt.DatabasesInstanceUsers_SQN
      and s1.DatabasesInstanceUsers_LDTS = pitt.s1_LDTS
    left outer join DV.S_DatabasesInstanceUsers_H s2
      on s2.DatabasesInstanceUsers_H_SQN = pitt.DatabasesInstanceUsers_SQN
      and s2.DatabasesInstanceUsers_H_LDTS = pitt.s2_LDTS
    order by 1


    query 2

    ;with all_dates
    as (select DatabasesInstanceUsers_SQN
       , DatabasesInstanceUsers_LDTS
      from DV.S_DatabasesInstanceUsers s1
      union
      select DatabasesInstanceUsers_H_SQN
       , DatabasesInstanceUsers_H_LDTS
      from DV.S_DatabasesInstanceUsers_H s2
    ),
    pitt
    as (select ld.DatabasesInstanceUsers_SQN
       , ld.DatabasesInstanceUsers_LDTS
      , max(s1.DatabasesInstanceUsers_LDTS) over (partition by ld.DatabasesInstanceUsers_SQN
                      order by ld.DatabasesInstanceUsers_LDTS
                    ) s1_LDTS
      , max(s2.DatabasesInstanceUsers_H_LDTS) over (partition by ld.DatabasesInstanceUsers_SQN
                      order by ld.DatabasesInstanceUsers_LDTS
                    ) s2_LDTS
      from all_dates ld
      left join DV.S_DatabasesInstanceUsers s1
       on s1.DatabasesInstanceUsers_SQN = ld.DatabasesInstanceUsers_SQN
       and s1.DatabasesInstanceUsers_LDTS = ld.DatabasesInstanceUsers_LDTS
      left join DV.S_DatabasesInstanceUsers_h s2
       on s2.DatabasesInstanceUsers_H_SQN = ld.DatabasesInstanceUsers_SQN
       and s2.DatabasesInstanceUsers_H_LDTS = ld.DatabasesInstanceUsers_LDTS
    )

    select pitt.DatabasesInstanceUsers_SQN
      , pitt.DatabasesInstanceUsers_LDTS
      , pitt.s1_LDTS
      , pitt.s2_LDTS
      , s1.DatabasesInstanceUsers_LDTS
      , s2.DatabasesInstanceUsers_H_LDTS
    from pitt pitt

    left outer join DV.S_DatabasesInstanceUsers s1
      on s1.DatabasesInstanceUsers_SQN = pitt.DatabasesInstanceUsers_SQN
      and s1.DatabasesInstanceUsers_LDTS = pitt.s1_LDTS
    left outer join DV.S_DatabasesInstanceUsers_H s2
      on s2.DatabasesInstanceUsers_H_SQN = pitt.DatabasesInstanceUsers_SQN
      and s2.DatabasesInstanceUsers_H_LDTS = pitt.s2_LDTS
    order by 1

  • I'd guess that...
    partition by ld.DatabasesInstanceUsers_SQN order by ld.DatabasesInstanceUsers_LDTS
    ...is not producing the same results each time you run it. you have a tie in there or something.


    "If I had been drinking out of that toilet, I might have been killed." -Ace Ventura

  • no ties - within each sqn the ldts are unique

  • Not even going to try since all we have are the queries,

  • Exec plans at least to try to answer this.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • don't see why it would make a different but try swapping the UNION for UNION ALL

    UNION will drop duplicated records but UNION ALL won't - which could result in cartesian joins - but it might point you in the right direction.

    Failing that, run both resutls set to temp tables and do an EXCEPT to identify the differences

  • aaron.reese - Thursday, December 14, 2017 5:11 AM

    don't see why it would make a different but try swapping the UNION for UNION ALL

    UNION will drop duplicated records but UNION ALL won't - which could result in cartesian joins - but it might point you in the right direction.

    Failing that, run both resutls set to temp tables and do an EXCEPT to identify the differences

    A UNION ALL will not produce a Cartesian (CROSS) join.  A UNION ALL is equivalent to addition, whereas a Cartesian product is equivalent to multiplication.  The two are not at all the same.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • Sorry Drew

    What I meant was that using UNION ALL could result in duplicate rows which when extended through a join would then effectively create a cartesian results set.  Yes you are correct UNION ALL is additive.

  • aaron.reese - Tuesday, December 19, 2017 7:35 AM

    Sorry Drew

    What I meant was that using UNION ALL could result in duplicate rows which when extended through a join would then effectively create a cartesian results set.  Yes you are correct UNION ALL is additive.

    No a union all does not effectively create a cartesian result set no matter how it's worded. Even if you have the same rows in two tables, union all and a cartesian product are not the same. The results in terms of how it works are important -  Add vs Multiply. Just add the number of rows in each or multiple the number of rows in each. It could look the same if you are doing 2 rows and 2 rows (2 + 2 = 4 and 2 * 2 = 4) but they are still totally different. Test the following and play with the number of rows in each table:
    create table FirstTable
    (SomeID int ,
    SomeDesc varchar(50) )

    create table SecondTable
    (SomeID int ,
    SomeDesc varchar(50) )

    insert into FirstTable
    values
    (1, 'one'),
    (2, 'two'),
    (3, 'three')

    insert into SecondTable
    values
    (1, 'one'),
    (2, 'two'),
    (3, 'Three')

    --cartesian - multiplication means 9 rows (3 rows in First table * 3 rows in second table)
    select *
    from FirstTable, SecondTable

    --union all - additon means 6 rows (3 rows in First table + 3 rows in Second table)
    select *
    from FirstTable
    union all
    select *
    from SecondTable

    Sue

  • aaron.reese - Tuesday, December 19, 2017 7:35 AM

    Sorry Drew

    What I meant was that using UNION ALL could result in duplicate rows which when extended through a join would then effectively create a cartesian results set.  Yes you are correct UNION ALL is additive.

    Still wrong.  A Cartesian product is only a CROSS JOIN by definition.  An INNER JOIN or OUTER JOIN are distinct from a CROSS JOIN, so they are distinct from a Cartesian product no matter how many duplicates any one row may produce.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply