CTE race condition?

  • Today on SQL 2008 I was working on a query that seemed to run fine on production in the morning and at night; but would fail during the day. It was something like:

    WITH Something

    AS (SELECT Blah,

    Blah2

    FROM BlahTable)

    SELECT *

    FROM Something

    CROSS APPLY (SELECT Count(DISTINCT Blah) AS BlahCount,

    Count(DISTINCT Blah2) AS Blah2Count

    FROM Something) SomethingCount

    What was happening was that BlahCount and Blah2Count would end up different to the number of actual distinct records returned.

    I thought that the CTE (it's a non-recursive one) constructed an in-memory table, and that as both the select and count worked on the same in-memory table, there'd work on that and there'd be no race condition. So why didn't it work in practice? (I ended up solving the problem in a different way, but I'd like to know what caused it, and obviously the real query was far more complex, haha).

  • I am taking a guess here and am no way sure about this, but from what I remember of CTE execution plans, the data is spooled out. I guess it is possible that the data in memory is changing as it is getting spooled out. Again, I would say I am less than 5% sure this is the cause, but it is the only thing that make sense to me right now.

    Anyone else have any thoughts?

    Fraggle.

  • this is certainly something that i would be interested in knowing as well. just out of curiosity really.

  • What does your execution plan for the query show? Could you post it here please?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • It may help to think of CTEs as views, since that's how they behave. Query plans generated from a CTEs may sometimes make use of a spool, but not all will.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • The Dixie Flatline (7/8/2011)


    What does your execution plan for the query show? Could you post it here please?

    Ditto on the execution plan. I too am curious.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • www.ssrstips.com (7/6/2011)


    I thought that the CTE (it's a non-recursive one) constructed an in-memory table, and that as both the select and count worked on the same in-memory table, there'd work on that and there'd be no race condition. So why didn't it work in practice? (I ended up solving the problem in a different way, but I'd like to know what caused it, and obviously the real query was far more complex, haha).

    I dislike the term Microsoft used for this: CTE (Common Table Expression) gives some people the impression that it is a table constructed in memory and reused as necessary by the main body of the query. This is exactly wrong.

    As Bob mentioned, non-recursive CTEs are in-line views. Speaking a little loosely, the definition of the CTE is expanded into the query body at each place it is referenced just like a traditional view. The resulting expanded query tree is then optimized and passed for execution.

    The net effect is that in most circumstances, multiple references to a CTE result in the expression being evaluated multiple times. Concurrent changes to the tables/views referenced by the CTE mean that each invocation may indeed return different results, unless a row-versioning isolation level (RCSI or SI) is in effect, or the isolation level is SERIALIZABLE (REPEATABLE READ is not sufficient). The other alternative is to materialize the CTE in a table variable or temporary table first, and then reference the (static) contents of that temporary structure in your query.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply