CTE and UNION performance issue

  • I am optimizing an inline table-valued function that performs calculations by heavily using OVER, LEAD and LAG. At one stage, records are doubled by performing a UNION. My statement can be schematized as:
    WITH
        Q1 AS (
            SELECT StartDate, EndDate FROM ...
        ),
        Q2 AS (
            SELECT StartDate AS EffectiveDate FROM Q1
            UNION ALL
            SELECT EndDate AS EffectiveDate FROM Q1
        ),
        Q3 AS (
            SELECT EffectiveDate FROM Q2
        )
    On my test database, computing Q1 takes 15 seconds but computing Q2 takes 30 seconds, as if Q1 is computed twice. If I store Q1 into a temporary table then performing the UNION on this table only takes 15 seconds, which proves my assertion. My problem is that my function will not be inline anymore if I need to use temporary tables.
    Does someone know how to trick the compiler so it does not recalculate Q1? I am testing on SQL Server 2014. Does someone know if SQL Server 2016 has better execution plans?
  • cmartel 20772 - Wednesday, May 24, 2017 1:09 PM

    I am optimizing an inline table-valued function that performs calculations by heavily using OVER, LEAD and LAG. At one stage, records are doubled by performing a UNION. My statement can be schematized as:
    WITH
        Q1 AS (
            SELECT StartDate, EndDate FROM ...
        ),
        Q2 AS (
            SELECT StartDate AS EffectiveDate FROM Q1
            UNION ALL
            SELECT EndDate AS EffectiveDate FROM Q1
        ),
        Q3 AS (
            SELECT EffectiveDate FROM Q2
        )
    On my test database, computing Q1 takes 15 seconds but computing Q2 takes 30 seconds, as if Q1 is computed twice. If I store Q1 into a temporary table then performing the UNION on this table only takes 15 seconds, which proves my assertion. My problem is that my function will not be inline anymore if I need to use temporary tables.
    Does someone know how to trick the compiler so it does not recalculate Q1? I am testing on SQL Server 2014. Does someone know if SQL Server 2016 has better execution plans?

    Maybe something like this:

    CREATE TABLE #SampleData ( StartDate datetime, EndDate datetime);
    INSERT INTO #SampleData
    SELECT DATEADD( mi, ABS( CHECKSUM( NEWID())) % 5000, '2017'), DATEADD( mi, ABS( CHECKSUM( NEWID())) % 5000, '2017')
    FROM sys.all_columns;

    WITH
    Q1 AS (
    SELECT StartDate, EndDate FROM #SampleData
    ),
    Q2 AS (
    SELECT StartDate AS EffectiveDate FROM Q1
    CROSS JOIN (VALUES(1),(2))x(n)
    ),
    Q3 AS (
    SELECT EffectiveDate FROM Q2
    )
    SELECT *
    FROM Q3;

    GO
    DROP TABLE #SampleData;

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Luis,

    I tried you statement (I put a TOP 5 to limit the results!) and it does not return StartDate and EndDate, it rather returns the StartDate twice. 

  • cmartel 20772 - Wednesday, May 24, 2017 1:39 PM

    Luis,

    I tried you statement (I put a TOP 5 to limit the results!) and it does not return StartDate and EndDate, it rather returns the StartDate twice. 

    I'm sorry. I missed the part where you're using different columns. The solution is very similar and it's explained in here
    http://www.sqlservercentral.com/articles/CROSS+APPLY+VALUES+UNPIVOT/91234/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks Luis, I replaced:

    SELECT HolderID, EffectiveDate, VotePercentage FROM Q1
    UNION ALL
    SELECT HolderID, EndDate, 0 FROM Q1

    with:

    SELECT L.HolderID, C.EffectiveDate, C.VotePercentage
    FROM Q1 L
    CROSS APPLY (VALUES (L.EffectiveDate, L.VotePercentage), (L.EndDate, 0)) C (EffectiveDate, VotePercentage)

    and the execution time went down from 30 seconds to 16 seconds. This means that Q1 is now only evaluated once.

  • cmartel 20772 - Thursday, May 25, 2017 9:09 AM

    This means that Q1 is now only evaluated once.

    That's correct.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • cmartel 20772 - Wednesday, May 24, 2017 1:09 PM

    I am optimizing an inline table-valued function that performs calculations by heavily using OVER, LEAD and LAG. At one stage, records are doubled by performing a UNION. My statement can be schematized as:
    WITH
        Q1 AS (
            SELECT StartDate, EndDate FROM ...
        ),
        Q2 AS (
            SELECT StartDate AS EffectiveDate FROM Q1
            UNION ALL
            SELECT EndDate AS EffectiveDate FROM Q1
        ),
        Q3 AS (
            SELECT EffectiveDate FROM Q2
        )
    On my test database, computing Q1 takes 15 seconds but computing Q2 takes 30 seconds, as if Q1 is computed twice. If I store Q1 into a temporary table then performing the UNION on this table only takes 15 seconds, which proves my assertion. My problem is that my function will not be inline anymore if I need to use temporary tables.
    Does someone know how to trick the compiler so it does not recalculate Q1? I am testing on SQL Server 2014. Does someone know if SQL Server 2016 has better execution plans?

    Are you, by any chance, grouping together consecutive bunches of days? Is this what the code is for?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Chris,
    Dates are not consecutive. This script transform transactional data into a register. Initial data is contained in two different tables. One contains percentages while the other contains values that will be used to calculate percentages. Initial data may look like:

    table1:
    2001-01-01, Me, 100%
    2001-01-05, Me, 50%
    2001-01-05, You 50%
    2001-01-10, You 100%
    table2:
    2001-01-20, Me, +100 //From this point on, we calculate the percentages. I own 100 out of 100
    2001-01-21,You, +100 //I own 100 out of 200 and you own 100 out of 200

    And reportable result would be

    Me, from 2001-01-01 to 2001-01-05, 100%
    Me, from 2001-01-05 to 2001-01-10, 50%
    Me, from 2001-01-20 to 2001-01-21, 100%
    Me, from 2001-01-21, 50%
    You, from 2001-01-05 to 2001-01-10, 50%
    You, from 2001-01-10 to 2001-01-20, 100%
    You, from 2001-01-21, 50%

    Producing the result requires several steps and my issue was related to one of these steps.

  • cmartel 20772 - Thursday, May 25, 2017 10:11 AM

    Chris,
    Dates are not consecutive. This script transform transactional data into a register. Initial data is contained in two different tables. One contains percentages while the other contains values that will be used to calculate percentages. Initial data may look like:

    table1:
    2001-01-01, Me, 100%
    2001-01-05, Me, 50%
    2001-01-05, You 50%
    2001-01-10, You 100%
    table2:
    2001-01-20, Me, +100 //From this point on, we calculate the percentages. I own 100 out of 100
    2001-01-21,You, +100 //I own 100 out of 200 and you own 100 out of 200

    And reportable result would be

    Me, from 2001-01-01 to 2001-01-05, 100%
    Me, from 2001-01-05 to 2001-01-10, 50%
    Me, from 2001-01-20 to 2001-01-21, 100%
    Me, from 2001-01-21, 50%
    You, from 2001-01-05 to 2001-01-10, 50%
    You, from 2001-01-10 to 2001-01-20, 100%
    You, from 2001-01-21, 50%

    Producing the result requires several steps and my issue was related to one of these steps.

    Fascinating. That's a great challenge, right there.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply