PIVOT taking forever unless temp table var used

  • This is a bit of an odd one for me..

    I have a query that returns:

    Response_ID UNIQUEIDENTIFIER,

    Alias VARCHAR(MAX),

    Answer_Raw VARCHAR(MAX)

    This query is quick.. returning in less than a second.

    If I try and run this:

    SELECT  
    Response_ID,
    [Gender],
    [YoB],
    [AgeAtEnroll],
    [Ethnicity],
    [Race_americanindian],
    [Race_asian],
    [Race_black],
    [Race_nativehawaiian],
    [Race_white],
    [Race_unknown],
    [Race_notreported],
    [Handedness],
    [Employment],
    [EnglishPrimaryLang],
    [LearnedEnglishAge],
    [EducationLevel],
    [MotherEducationLevel],
    [SzHistory],
    [NumFamilySz],
    [FamilySinastrality],
    [LearningDisability],
    [RepeatGrade],
    [SpecialAccomCoursework]
    FROM
    (SELECT Response_ID, Alias, Answer_Raw FROM Survey_Answers WHERE Survey_ID = 2) a
    PIVOT
    (
    MAX(Answer_Raw)
    FOR Alias IN (
    [Gender],
    [YoB],
    [AgeAtEnroll],
    [Ethnicity],
    [Race_americanindian],
    [Race_asian],
    [Race_black],
    [Race_nativehawaiian],
    [Race_white],
    [Race_unknown],
    [Race_notreported],
    [Handedness],
    [Employment],
    [EnglishPrimaryLang],
    [LearnedEnglishAge],
    [EducationLevel],
    [MotherEducationLevel],
    [SzHistory],
    [NumFamilySz],
    [FamilySinastrality],
    [LearningDisability],
    [RepeatGrade],
    [SpecialAccomCoursework]
    )
    ) AS PivotTable;

    The result can take anywhere from 2-3 minutes.

    However, if I put the results from Survey_Answers into a temporary table variable, the result comes back in less than a second again.  So while I have a workaround, I'm totally unsure as to WHY, and any hints in the right direction would be appreciated.

    DECLARE @tbl TABLE (
    Response_ID UNIQUEIDENTIFIER,
    Alias VARCHAR(MAX),
    Answer_Raw VARCHAR(MAX)
    )

    INSERT INTO @tbl
    SELECT
    Response_ID,
    Alias,
    Answer_Raw
    FROM
    Survey_Answers
    WHERE
    Survey_ID = 2

    SELECT
    Response_ID,
    [Gender],
    [YoB],
    [AgeAtEnroll],
    [Ethnicity],
    [Race_americanindian],
    [Race_asian],
    [Race_black],
    [Race_nativehawaiian],
    [Race_white],
    [Race_unknown],
    [Race_notreported],
    [Handedness],
    [Employment],
    [EnglishPrimaryLang],
    [LearnedEnglishAge],
    [EducationLevel],
    [MotherEducationLevel],
    [SzHistory],
    [NumFamilySz],
    [FamilySinastrality],
    [LearningDisability],
    [RepeatGrade],
    [SpecialAccomCoursework]
    FROM
    @tbl AS SourceTable
    PIVOT
    (
    MAX(Answer_Raw)
    FOR Alias IN (
    [Gender],
    [YoB],
    [AgeAtEnroll],
    [Ethnicity],
    [Race_americanindian],
    [Race_asian],
    [Race_black],
    [Race_nativehawaiian],
    [Race_white],
    [Race_unknown],
    [Race_notreported],
    [Handedness],
    [Employment],
    [EnglishPrimaryLang],
    [LearnedEnglishAge],
    [EducationLevel],
    [MotherEducationLevel],
    [SzHistory],
    [NumFamilySz],
    [FamilySinastrality],
    [LearningDisability],
    [RepeatGrade],
    [SpecialAccomCoursework]
    )
    ) AS PivotTable;

    --------------------------

    I long for a job where my databases dont have any pesky users accessing them 🙂

  • Your original query refers to Survey_Answers table and fetches only 3 columns and has a WHERE clause too. This has been done as part of subquery. Since you have not shared the execution plan for comparison, hence we cannot comment what really is causing such a huge difference in execution time? At first instance it seems the subquery is an accused but without proof can't declare it as convict. There is a nice article A SUB-QUERY DOES NOT HURT PERFORMANCE by Grant Fritchey.

    I have few questions for you :

    1. In the 2nd run of the modified query with the Table Variable, have you also included the time taken for insert into table variable when you got result within a second? In other words, have you executed the INSERT INTO table variable and SELECT with pivot separately or together?
    2. How many times you executed this query? Is it part of any Stored Procedure?

    If you can also share the DDL and some dummy data for us to test then we can test it and share our observations.

  • Post the plans for both queries and it might be possible to suss out what's going on.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • What do you get when you write it with straight criteria instead of selecting from the sub-query?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Thanks all - I thought there might just be some super simple reason like PIVOT just works better with in-memory data etc.

    This example of demographics data takes < 1 second with the temp table var and approximately 16 seconds without it.  Other, larger sets of data take 2-3 seconds and 2-3 minutes respectively.

    A little more detail.  Survey_Answers is a view that sits in another database on the same SQL instance.  The user pulling the data has full access to both databases.  Survey_Answers pulls from 12 tables and with full disclosure includes probably far more information than is really needed for this particular PIVOT query.  However, it still seems super weird to me that inserting into a table variable from the view AND doing the pivot takes less than a second while just doing the pivot directly on the view takes 16 seconds.

    I've attached 4 files

    • without_temp_tbl_var.sql - This is the query where I don't use the @tbl var.  It takes 16 seconds
    • without_temp_tbl_var.sqlplan - The execution plan for the above
    • with_temp_tbl_var.sql - The query not using the @tbl var.  It takes less than a second
    • with_temp_tbl_var.sqlplan - The execution plan for the above

    I appreciate any direction you guys can give me!

    Thanks

    --------------------------

    I long for a job where my databases dont have any pesky users accessing them 🙂

  • torpkev wrote:

    Thanks all - I thought there might just be some super simple reason like PIVOT just works better with in-memory data etc.

    There might be!!! Did you try what I suggested in my previous post?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi Jeff,

    Sorry, I might be dim here - I can't use it without the subquery because I need the where clause on Survey_Answers  (limited to Survey_ID = 2)

    The WHERE clause before the PIVOT doesn't work.. adding the WHERE clause after the PIVOT ran for over a minute before I killed the query.

    Though if you can point out where I'm doing it wrong, that'd be great, I'd love to be able to learn something new 🙂

    --------------------------

    I long for a job where my databases dont have any pesky users accessing them 🙂

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply