CTE Performance Compared to a Temp Table

  • We have an interesting problem where the results were not what was expected.

    We are trying to maximize the query performance of a frequently used query. Where it gets slow is combining the results of full text searches with the rest of the query. I realize that since the whole query it not shown here, and the names have been changed to protect the innocent, the answer may not be obvious.

    The first query looks roughly like this:

    DECLARE @searchresults TABLE (ID NUMERIC(26,0) PRIMARY KEY)

    INSERT INTO @searchresults (ID)

    SELECT DISTINCT masterRecordID FROM NoteDetails WHERE contains(NoteText, @SearchText)

    UNION

    SELECT DISTINCT id FROM masterTable m WHERE contains((Ref, Summary, desc, resolution),@SearchText)

    SELECT ...

    FROM masterTable m INNER JOIN ....

    WHERE ... AND EXISTS (SELECT * FROM @searchresults WHERE ID = m.ID))

    The second query looks like this:

    WITH SearchResults (ID) AS (

    SELECT DISTINCT masterRecordID FROM NoteDetails WHERE contains(NoteText, @SearchText)

    UNION

    SELECT DISTINCT id FROM masterTable WHERE contains((Ref, Summary, desc, resolution),@SearchText)

    )

    SELECT ...

    FROM SearchResults r INNER JOIN masterTable m ON r.ID=m.ID

    INNER JOIN ....

    WHERE ....

    The database is well-indexed, so that the query inside of the CTE is very fast (< 1 sec), and the main SELECT is very fast (< 1 sec). The ID column of masterTable is a primary key, and the masterRecordID of the NoteDetails table is an indexed foreign key.

    When trying to correlate the search results with the main select using the first query (the temporary table), the results come out very fast (~ 2 sec or less).

    When trying to correlate the CTE search results with the main query using a join, the query is significantly slower (~10 sec), which seems counterintuitive. I would think that joining the search results against the main table, it would perform better than the query that is using the EXISTS against a temporary table.

    Can anyone offer any kind of ideas on why this may be?

  • It is hard to tell from the data you have provided. However, it appears that the second query (with the CTE) is actually a correlated sub-query.

    (You join the CTE to MasterTable, but MasterTable is also part of the CTE.) If true, that would probably explain it. Correlated sub-queries are horrible for performance, since the sub-query (the CTE in this case) gets executed once for each record in the main query.

  • What happens if you use the CTE in an EXISTS clause in the WHERE clause like you are using the temporary table?

  • Can you post execution plans for each of the two methods?

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

  • I think I have a better understanding of the problem. This sounds like a classic mismatch between what I expected would happen and what actually happens.

    My expectation was that the CTE would get evaluated first into the equivalent of a temporary table, and could then be used in the SELECT. What I think I hear actually happens is that the CTE behaves more like a classic C macro - it is evaluated more like a subquery within the SELECT.

    I'll play around with it more tomorrow, but I think the first reply was on the money.

  • I would try dropping off the PK from the table variable (note you do NOT have a temporary table here - it is a table var with many things that come along with that, most of them undesirable in many settings). I can't count the number of times I have REMOVED PKs and/or indexes from temp tables/table vars at clients and gotten a perf win.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (12/30/2011)


    I would try dropping off the PK from the table variable (note you do NOT have a temporary table here - it is a table var with many things that come along with that, most of them undesirable in many settings). I can't count the number of times I have REMOVED PKs and/or indexes from temp tables/table vars at clients and gotten a perf win.

    I think he's saying the version with the table variable worked ok, whereas the CTE was the slow version. However its an interesting note you have here that removing indexes would increase performance, is it because they're temp tables / table vars or just an index interferring with performance in general somehow?

  • CTE's work kinda like a view-on-the-fly and SQL Server sometimes balks at the complexity of the resolved query. Chained CTE's are particularly bad, and also those containing UNION as opposed to UNION ALL.

    -- try running the result into a #temp table instead of a variable and indexing on ID (clustered, unique)

    -- performance will likely improve if the result set is more than a few hundred rows

    DECLARE @SearchResults TABLE (ID NUMERIC(26,0) PRIMARY KEY)

    INSERT INTO @SearchResults (ID)

    SELECT masterRecordID FROM NoteDetails WHERE contains(NoteText, @SearchText)

    UNION

    SELECT id FROM masterTable m WHERE contains((Ref, Summary, desc, resolution),@SearchText)

    -- try this as an alternative

    SELECT ...

    FROM masterTable m

    INNER JOIN ....

    WHERE ...

    AND (EXISTS (SELECT 1 FROM NoteDetails WHERE masterRecordID = m.ID AND contains(NoteText, @SearchText))

    OR EXISTS (SELECT 1 FROM masterTable WHERE ID = m.ID AND contains((Ref, Summary, desc, resolution),@SearchText)))


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • patrickmcginnis59 (12/30/2011)


    TheSQLGuru (12/30/2011)


    I would try dropping off the PK from the table variable (note you do NOT have a temporary table here - it is a table var with many things that come along with that, most of them undesirable in many settings). I can't count the number of times I have REMOVED PKs and/or indexes from temp tables/table vars at clients and gotten a perf win.

    I think he's saying the version with the table variable worked ok, whereas the CTE was the slow version. However its an interesting note you have here that removing indexes would increase performance, is it because they're temp tables / table vars or just an index interferring with performance in general somehow?

    PK/Indexes on temp table/table vars are often completely useless to the optimization process and query plan. Thus you pay the cost to build them (which is especially high for a clustered PK) only to do a table scan anyway. many times I will see a PK/indexed temp table that is simply selected from on output with no where clause. Or is joined to other tables and the entire temp table needs to be read anyway. So many times it is a complete waste to have the index(es).

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (12/30/2011)


    patrickmcginnis59 (12/30/2011)


    TheSQLGuru (12/30/2011)


    I would try dropping off the PK from the table variable (note you do NOT have a temporary table here - it is a table var with many things that come along with that, most of them undesirable in many settings). I can't count the number of times I have REMOVED PKs and/or indexes from temp tables/table vars at clients and gotten a perf win.

    I think he's saying the version with the table variable worked ok, whereas the CTE was the slow version. However its an interesting note you have here that removing indexes would increase performance, is it because they're temp tables / table vars or just an index interferring with performance in general somehow?

    PK/Indexes on temp table/table vars are often completely useless to the optimization process and query plan. Thus you pay the cost to build them (which is especially high for a clustered PK) only to do a table scan anyway. many times I will see a PK/indexed temp table that is simply selected from on output with no where clause. Or is joined to other tables and the entire temp table needs to be read anyway. So many times it is a complete waste to have the index(es).

    No stats on a table variable - not so with a temp table. Many times I've seen an unindexed temp table of 10k rows used in a 30-60s query, indexed in a half-second, and query time reduced from 40s to a second or less...

    Same as always - test and compare.


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • ChrisM@home (12/30/2011)


    TheSQLGuru (12/30/2011)


    patrickmcginnis59 (12/30/2011)


    TheSQLGuru (12/30/2011)


    I would try dropping off the PK from the table variable (note you do NOT have a temporary table here - it is a table var with many things that come along with that, most of them undesirable in many settings). I can't count the number of times I have REMOVED PKs and/or indexes from temp tables/table vars at clients and gotten a perf win.

    I think he's saying the version with the table variable worked ok, whereas the CTE was the slow version. However its an interesting note you have here that removing indexes would increase performance, is it because they're temp tables / table vars or just an index interferring with performance in general somehow?

    PK/Indexes on temp table/table vars are often completely useless to the optimization process and query plan. Thus you pay the cost to build them (which is especially high for a clustered PK) only to do a table scan anyway. many times I will see a PK/indexed temp table that is simply selected from on output with no where clause. Or is joined to other tables and the entire temp table needs to be read anyway. So many times it is a complete waste to have the index(es).

    No stats on a table variable - not so with a temp table. Many times I've seen an unindexed temp table of 10k rows used in a 30-60s query, indexed in a half-second, and query time reduced from 40s to a second or less...

    Same as always - test and compare.

    I've had similar experiences. I just want to reiterate the test and compare

    Jason...AKA CirqueDeSQLeil
    _______________________________________________
    I have given a name to my pain...MCM SQL Server, MVP
    SQL RNNR
    Posting Performance Based Questions - Gail Shaw[/url]
    Learn Extended Events

Viewing 11 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply