Simplify Large Queries with Temporary Tables, Table Variables and CTEs

  • Comments posted to this topic are about the item Simplify Large Queries with Temporary Tables, Table Variables and CTEs

  • Nice article, and definitely on topic for real database developers...

    However the more relevant question we are often faced with is what performs better - table variables, temp tables, or CTE's or leave the damn thing in one huge query ?

    An 'It depends' answer would be ok, if expanded on!

  • A very good article , it really enhances the performance just by using the identity columns in #Temp and ##Temp tables rather to using order by while inserting the rows , specifically when you nedd to loop through this data at some later point in the session.

  • I'd echo the above point: would be very interesting to see some analysis of each technique performs.

  • I use CTE queries a lot when building datasources for my reports. I prefer them over #temp tables and @tables (because there is more work involved creating those)

    Once you go slightly overboard with CTEs you'll notice that technique isn't working for a specific problem. A CTE can be running fast, you make a slight change and it runs really really bad. That's where I usually start looking at a CTE > insert that set into a #temp table and finish it off with a SELECT or another CTE against the #temp table.

    Those occasions are rare though.

    Nice article, cheers.

  • Josh Ashwood (8/9/2011)


    However the more relevant question we are often faced with is what performs better - table variables, temp tables, or CTE's or leave the damn thing in one huge query ?

    It depends. Really.

    It'll vary with the size and complexity of what you're doing. Simpler queries will probably be better kept in one or with CTEs (if it makes things easier to read). More complex ones may benefit from being split up and run in bits.

    The major difference that you must realise is that temp tables and table variables are both physical structures - they both store the data in an intermediate state. CTEs are not tables, they're just named subqueries, so they're not splitting out the processing, doing some now some later. It's still a single query.

    Table variables, because of no statistics, can often be problematic on larger row counts (where anything over a couple hundred is probably large)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • My comment in agreement with what GilaMonster has said, is to be very carful do do proper performance and scale testing if you decide to use table variables, they can perform very badly on large data sets.

    As an annecdotal example, I have inherited a number of very large and complex queries that use table variables, some of these were taking more than half an hour to run. Doing nothing to them other than changing the table variables to temporary table gets them returning in 30 seconds or so.

  • The article is really nice, but its very important to consider the performance before using a table variables, temp tables, or CTE's.

    In terms of performance I think we should avoid using a temp table both in a stand alone query or in a SP, as the SQL optimizer can't reuse the execution plan for a query/sp with a temp table. It creates a new execution plan each time we reexecute the query/SP. In case of a large SP using a code block with a temp table is very much prone to face a performance drawback because the SQL optimizer is unable to reuse the execution plan each time we reexecute the SP with changed input parameters.

    However we use tablevariable or CTE for the same requirement which doesn't hamper the performance.

  • One thing that I have found that can cause a temporary table or table variable to perform better than a CTE is that you can create indexes on them. This will help when joining them to other tables in later processing. No, I have not actually captured statistics on this, but have noticed a perceived performance boost. As with everything SQL Server, this will all be impacted by other tasks running at the same time.

    I will say that I have found that a series of CTE's, each building on the last, does better than the massive join's that I have encountered.

  • ben.reese (8/9/2011)


    My comment in agreement with what GilaMonster has said, is to be very carful do do proper performance and scale testing if you decide to use table variables, they can perform very badly on large data sets.

    As an annecdotal example, I have inherited a number of very large and complex queries that use table variables, some of these were taking more than half an hour to run. Doing nothing to them other than changing the table variables to temporary table gets them returning in 30 seconds or so.

    I agree with ben.reese and GilaMonster...I also have run into situations where doing nothing more than moving sub-queries into CTEs or temp tables has increased performance drastically. You definitely need to test each situation as it comes up and just keep in mind that you have several options available.

  • It is a concept to solve complex queries

    Thanks

  • A good introduction to the alternatives, but I take exception to the statement that "it is impractical to create new views for every query you write." Certainly, you wouldn't create a new view for EVERY query, just as you won't use Temp Tables or CTEs in EVERY query.

    For the most part CTEs replace subqueries and potentially make the queries more readable, but don't seem to offer much in the way of performance improvement. They do this by allowing reuse (like a view), but also by keeping the entire query contained. This can be good but also reduces code reuse--a view can be used in more than one query. Their main functional benefit over a view is their ability to use the parameters in a SP. This benefit is shared by a TVF, however there are limitations to what can be done in a function, but again Functions are available for code reuse while CTEs are not.

    Any design decision will, of course, need to be made in the context of a specific need, balancing code reuse, performance, maintainability, and security.

    As Always YMMV,

    --

    JimFive

  • Table variables and CTEs are stored in memory (if sufficient memory is available), while temporary tables are always stored on disk.

    This is untrue, or at best misleading. From http://support.microsoft.com/kb/305977:

    Q4: Are table variables memory-only structures that are assured better performance as compared to temporary or permanent tables, because they are maintained in a database that resides on the physical disk?

    A4: A table variable is not a memory-only structure. Because a table variable might hold more data than can fit in memory, it has to have a place on disk to store data. Table variables are created in the tempdb database similar to temporary tables. If memory is available, both table variables and temporary tables are created and processed while in memory (data cache).

    (Emphasis added).

    Several people have written about this. Here's one[/url], here's another[/url].

    Rich

  • A temporary table is available as long as the database connection with which it was executed is still open. If it is declared with the "##" prefix, it is a global variable available to other sessions. If it is declared with the "#" prefix, it is a local variable available only in the current session.

    Worth mentioning here that in the case of global temp tables(##), which actually are kind of rarely used, they remain active until there is at least one open connection that uses them, regardless if it is the one that created it or another. The local temp tables(#) are only available to the connection that created them, of course.

    Don't just give the hungry man a fish, teach him how to catch it as well.

    the sqlist

  • 1) The primary if not only reason do break down a complex query into multiple steps with interim storage (using TEMPORARY TABLES, NOT TABLE VARIABLES as discussed in number 2 below) is to ensure optimal performance of said query. You absolutely should NOT break down a single query into multiple steps just for 'human readability'.

    2) Table variables should almost NEVER be used. The two exceptions I can think of are if you have a VERY high call situation where recompilations related to temp table activity are causing performance degredation or if you need interim data to remain in place for auditing purposes after a transaction rollback. Others mention performance problems from table variables with lots of rows. Well, I can show you an example that gives bad performance with just a SINGLE row in a table variable.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 15 posts - 1 through 15 (of 39 total)

You must be logged in to reply to this topic. Login to reply