temp table vs subquery

Question

temp table vs subquery

sqlguy-736318

Hall of Fame

Points: 3859
More actions
November 28, 2008 at 2:12 pm

#99375

Which option do you tend to choose as a default and why?

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply

Jack Corbett SSC Guru Points: 184393 More actions · Answer 1

It depends. I'd use a temp table or table variable if I am updating/deleting data from that set of data or accessing the data multiple times and now with CTE's I'd use a CTE in other situations as I find them more readable than derived tables or sub-queries.

Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 2

Jack is correct, it depends. Also in deciding between temp tables and table variables it depends not only on the usage but the size of the data set as well.

I definitely tend to use CTE's over derived tables, cleaner code for one, and two if you need the same "table" more than once (careful how many times to use the CTE, a view or temp table may be better) in the same query its defined once and referenced just like a table.

Jeff Moden SSC Guru Points: 1003861 More actions · Answer 3

I tend to steer clear of using Table Variables... there's a whole lot of reasons for that but the biggest reason is that when in Query Analyzer or SMS for troubleshooting, they go away as soon as the run is over, just like any other variable. Since Temp Tables persist in such a session, you don't have to "pollute" the code with a bunch of Select's to see what's in them.

Before folks get into what could be a heated debate, Table Variables, Temp Tables, and Derived Tables (sub queries in the FROM clause) all live in the same places... they all start out in memory and if they "spill over", they spill over into TempDB.

Don't take my word for it though... here's an interesting FAQ article on Temp Tables and Table Variables... Q3/A3 and Q4/A4 where of particular interest to me...

http://support.microsoft.com/default.aspx?scid=kb;en-us;305977&Product=sql2k

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 4

One good place for table variables is in testing insert/update/delete testing where you know you will be rolling back the changes with an explicit BEGIN TRANSACTION/ROLLBACK TRANSACTION, but want to capture those changes for review (ie loading them into a permanent or temporary table outside of the transaction) since they are not affected by the rollback.

edwardelliott SSCommitted Points: 1863 More actions · Answer 5

I think it is down to how you choose to work and what works best in a specific circumstance, apart from being able to create indexes and statistics on temp tables - I personally prefer table variables as I don't have to drop them everytime I run a query, that is my personal choice.

As with any other queries you need to test and see which is fastest and most suits your needs.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

Jeff Moden (11/28/2008)
Before folks get into what could be a heated debate, Table Variables, Temp Tables, and Derived Tables (sub queries in the FROM clause) all live in the same places... they all start out in memory and if they "spill over", they spill over into TempDB.

Temp tables and table variables yes, but derived tables aren't 'stored' in the majority of cases. They're part of the query and, unless there are complex, horrendous, messes in the subquery or non-equality joins, the query processor will run the entire query, with derived table, without needing to store intermediate results.

For complex queries with complex subqueries where the query processor does need an intermediate result stored (table spool/index spool) then those can spill to disk if they won't fit in the memory that's available for the processing of that query.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1003861 More actions · Answer 7

So you're saying that a derived table takes no memory?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Lynn Pettis SSC Guru Points: 442467 More actions · Answer 8

Jeff Moden (11/29/2008)
So you're saying that a derived table takes no memory?

Actually, I think she was saying that unless necessary, derived tables only exist in memory.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 9

Jeff Moden (11/29/2008)
So you're saying that a derived table takes no memory?

Considering that a derived table in a query isn't usually executed in 2 steps (ie, the query processor doesn't (usually) take the query that forms the derived table, run that, store the result somewhere, then run the outer query and join the stored result in), that's exactly what I'm saying.

The term 'derived table' is kinda misleading, same as a cte (common table expression). The table in the name does not mean that they are stored anywhere, memory or disk.

That said, there are times (complex subqueries, complex outer queries, non-equality joins, etc) where that has to be done and that's when you start seeing table spools or index spools in an execution plan. Those are temporarily stored results and they can, if they get big, spill to disk.

Let's take a fairly trivial example.

SELECT p.ProductID, ProductNumber, SumTotal, SumDiscount

FROM Production.Product p

INNER JOIN (

SELECT productid, SUM(LineTotal) SumTotal, SUM(LineTotal*UnitPriceDiscount) SumDiscount

FROM Sales.SalesOrderDetail

GROUP BY ProductID

HAVING SUM(LineTotal*UnitPriceDiscount) > SUM(LineTotal)*0.01

) SalesTotals on p.ProductID = SalesTotals.ProductID

WHERE ProductNumber like 'bk%'

If I were to run that, the exec plan (DerivedTables1 in the attached zip) shows no spools of any kind, hence no interim storage of the derived table. In fact, the plan is similar (but not exactly identical) to that of this query and, according to the optimiser is the same cost. The only difference is which table is used for the hash probe.

SELECT p.ProductID, ProductNumber, SUM(LineTotal) SumTotal, SUM(LineTotal*UnitPriceDiscount) SumDiscount

FROM Production.Product p

INNER JOIN Sales.SalesOrderDetail s on p.ProductID = s.ProductID

WHERE ProductNumber like 'bk%'

GROUP BY p.ProductID, ProductNumber

HAVING SUM(LineTotal*UnitPriceDiscount) > SUM(LineTotal)*0.01

Now, if I make the derived table nastier (with a non-equality join) we do get a spool and hence there is an interim storage of results during the query's execution. (DerivedTables1 in the attached zip)

SELECT p.ProductID, ProductNumber, SumTotal, SumDiscount

FROM Production.Product p

INNER JOIN (

SELECT productid, SUM(LineTotal) SumTotal, SUM(LineTotal*UnitPriceDiscount) SumDiscount

FROM Sales.SalesOrderDetail

GROUP BY ProductID

HAVING SUM(LineTotal*UnitPriceDiscount) > SUM(LineTotal)*0.01

) SalesTotals on p.ProductID > SalesTotals.ProductID

WHERE ProductNumber like 'bk%'

Interestingly though, it's not the derived table that gets spooled. It's the product table, the table in the outer query, that gets put into a spool and repeatedly read and rewound.

If you like, I can dig into the DMVs ans see if I can find the memory grant for each of those queries. It shhould be stored somewhere.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1003861 More actions · Answer 10

Nope... no digging required. That's basically what I meant but a whole lot shorter. 😉 Your good explanation is, of course, the most accurate... I was just trying to give a 30,000 ft overview.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 11

Jeff Moden (11/29/2008)
Your good explanation is, of course, the most accurate... I was just trying to give a 30,000 ft overview.

No prob. I was just worried that someone who didn't understand the nuances would run across the thread, take the 30,000 ft overview as gospal truth in all cases and suddenly we have a new myth floating around that derived tables are written to disk.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1003861 More actions · Answer 12

GilaMonster (11/30/2008)
Jeff Moden (11/29/2008)
Your good explanation is, of course, the most accurate... I was just trying to give a 30,000 ft overview.
No prob. I was just worried that someone who didn't understand the nuances would run across the thread, take the 30,000 ft overview as gospal truth in all cases and suddenly we have a new myth floating around that derived tables are written to disk.

Agreed... thanks again... good stuff, Gail.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

ChiragNS One Orange Chip Points: 26137 More actions · Answer 13

Gail

Now i could be completely wrong....but

would'nt it make derived table queries slightly more resource intensive than using temp tables. I mean to say that using temp tables we could do things in short steps rather than have a comparitively complex query being executed.

"Keep Trying"

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 14

Depends what you're doing.

Usually derived tables are less intensive because there's no interim storage and no need to insert stuff somewhere and then query it out. If you're doing a query that's a page or 2 long, it probably needs to be broken up, but for simpler stuff, temp tables are going to add complexity and time.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass