RE: Missing Indexes on Temp tables

SSC Guru

Points: 260824

December 17, 2012 at 9:06 am

ScottPletcher (12/17/2012)
TheSQLGuru (12/16/2012)
My statement about making "queries" faster is about the ENTIRE process. The act of populating a temp table with an index already on the temp table before data is added requires not-insignificant work to populate the index(es) as the data is added. This is especially true of the clustered PK scenario. And adding an index after data population requires a full scan of the temp table and the actual effort required to create and store the index. The VAST majority of the times I have seen this done the table is hit ONCE in a join or some other query type and the cumulative cost/duration/effort of adding the index made the entire process run slower.
Gonna stick by my statements here - in my experience most indexes on temp tables do not make for overall faster/more efficient executions (outside of my exception about iterative hits on said temp table where you get an index seek or other significant benefit).
If the data is only being read/used once, why is it being stored in tempdb at all??

I've seen plenty of situations where dumping data into a temp table, then using the temp table, is more efficient than complex sub-queries.

You only use the data once, but it splits up the execution of the final operation into smaller chunks, and the optimizer does a much better job on those than on a single mother-of-all-queries type operation.

Basically, a temp table pre-populated, can often outperform complex CTEs, cross-applies, etc., simply by reducing the complexity the optimizer has to deal with.

In those cases, a reasonable clustered index on the temp table can sometimes help, but also sometimes not.

I've gone both ways on temp tables. Some need indexing. Some need a lack of indexing.

For example, I recently wrote an ETL process that includes very complex business rules. The main query has 18 sub-queries, using Cross or Outer Apply, rooted on data in a single table. Two of the sub-queries were so complex that I moved them to temp tables. Just splitting those 2 out of the main query, took total execution time from over an hour to under a minute. Since the process is supposed to run hourly, that's a critical improvement. But they're both technically "temp heaps", since they're completely unindexed. Tried adding a clustered index to each, but it didn't improve anything at all.

On the other side, I once wrote a very complex proc that compared an order to a complex set of "business suggestions", based on heuristic patterns. To simplify the code, and to deal with what would otherwise have been very aggressive locking issues, I used temp tables with some serious indexing, instead of the primary tables. This allowed the order data to be compared to "the rules" as they were at the time the check was initiated, without having to lock the whole database. The indexes used in that process were critical to the speed and efficiency of the whole thing. Lots of indexes on dozens of temp tables. Took the proc from several minutes total run-time, down to about 5 seconds, and got rid of numerous lock/block/deadlock issues at the same time.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon