Click here to monitor SSC
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Missing Indexes on Temp tables


Missing Indexes on Temp tables

Author
Message
ScottPletcher
ScottPletcher
Hall of Fame
Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)

Group: General Forum Members
Points: 3946 Visits: 6685
TheSQLGuru (12/16/2012)
My statement about making "queries" faster is about the ENTIRE process. The act of populating a temp table with an index already on the temp table before data is added requires not-insignificant work to populate the index(es) as the data is added. This is especially true of the clustered PK scenario. And adding an index after data population requires a full scan of the temp table and the actual effort required to create and store the index. The VAST majority of the times I have seen this done the table is hit ONCE in a join or some other query type and the cumulative cost/duration/effort of adding the index made the entire process run slower.

Gonna stick by my statements here - in my experience most indexes on temp tables do not make for overall faster/more efficient executions (outside of my exception about iterative hits on said temp table where you get an index seek or other significant benefit).



If the data is only being read/used once, why is it being stored in tempdb at all??

SQL DBA,SQL Server MVP('07, '08, '09)

Prosecutor James Blackburn, in closing argument in the "Fatal Vision" murders trial: "If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them."
GSquared
GSquared
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14375 Visits: 9729
ScottPletcher (12/17/2012)
TheSQLGuru (12/16/2012)
My statement about making "queries" faster is about the ENTIRE process. The act of populating a temp table with an index already on the temp table before data is added requires not-insignificant work to populate the index(es) as the data is added. This is especially true of the clustered PK scenario. And adding an index after data population requires a full scan of the temp table and the actual effort required to create and store the index. The VAST majority of the times I have seen this done the table is hit ONCE in a join or some other query type and the cumulative cost/duration/effort of adding the index made the entire process run slower.

Gonna stick by my statements here - in my experience most indexes on temp tables do not make for overall faster/more efficient executions (outside of my exception about iterative hits on said temp table where you get an index seek or other significant benefit).



If the data is only being read/used once, why is it being stored in tempdb at all??


I've seen plenty of situations where dumping data into a temp table, then using the temp table, is more efficient than complex sub-queries.

You only use the data once, but it splits up the execution of the final operation into smaller chunks, and the optimizer does a much better job on those than on a single mother-of-all-queries type operation.

Basically, a temp table pre-populated, can often outperform complex CTEs, cross-applies, etc., simply by reducing the complexity the optimizer has to deal with.

In those cases, a reasonable clustered index on the temp table can sometimes help, but also sometimes not.

I've gone both ways on temp tables. Some need indexing. Some need a lack of indexing.

For example, I recently wrote an ETL process that includes very complex business rules. The main query has 18 sub-queries, using Cross or Outer Apply, rooted on data in a single table. Two of the sub-queries were so complex that I moved them to temp tables. Just splitting those 2 out of the main query, took total execution time from over an hour to under a minute. Since the process is supposed to run hourly, that's a critical improvement. But they're both technically "temp heaps", since they're completely unindexed. Tried adding a clustered index to each, but it didn't improve anything at all.

On the other side, I once wrote a very complex proc that compared an order to a complex set of "business suggestions", based on heuristic patterns. To simplify the code, and to deal with what would otherwise have been very aggressive locking issues, I used temp tables with some serious indexing, instead of the primary tables. This allowed the order data to be compared to "the rules" as they were at the time the check was initiated, without having to lock the whole database. The indexes used in that process were critical to the speed and efficiency of the whole thing. Lots of indexes on dozens of temp tables. Took the proc from several minutes total run-time, down to about 5 seconds, and got rid of numerous lock/block/deadlock issues at the same time.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Jeff Moden
Jeff Moden
SSC-Forever
SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)SSC-Forever (45K reputation)

Group: General Forum Members
Points: 45119 Visits: 39918
I'll second what GSquared just said.

Sometimes, the optimizer needs a little help. Sometimes that help is to just very quickly isolate a much smaller subset in a temp table (which the optimizer sometimes does on its own, BTW. They're called "Work" tables and you can see them with SET STATISTICS IO ON). It can make all the difference in the world.

Probably the greatest example that I can personally remember was with a year-end query that some folks had written at a previous company. It was so resource intensive that they actually asked me to build a separate instance of the database to run the bloody query because it would paralyze the server for a good 45 minutes.

I got it down to 8 seconds that made only a "pip" on CPU usage by isolating one part of the query to a Temp Table and then joining on that Temp Table. To be sure, the data in the Temp Table was only used once.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we dont actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesnt mean you should. Wink

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
ganeshmuthuvelu
ganeshmuthuvelu
SSC Journeyman
SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)SSC Journeyman (90 reputation)

Group: General Forum Members
Points: 90 Visits: 305
I am the OP here. It appears that the thread became a discsussion for whether indexes are needed on temp tables or not. I can say for sure that the answer is "yes, it is needed", particularly when large volume of data is on the temp tables and joins are done.

Now, if you all can be kind enough to read my original question and help me find a way to make SQL Server find out such missing indexes that would help performance, it will be helpful. I have summaized it again below.

1) Let's say "#t1" is created in session id 100 and #t1 has no indexes on any columns and has 100,000 rows in it.
2) The end user has his "own, custom" stored procedure that works on the data in the #t1 and joins #t1 with another perm. table say "MainData".
3) The temp. table "#t1" and the stored procedure are created by the end user.
4) Whereas the perm. table "MainData" (part of the application) is properly indexed, the "OrderID" column in "#t1" is not indexed which is used in the join with "MainData" and this causes slow performance. If the #t1 is indexed on "OrderID", the performance is great.
5) Now, at the end of the SP execution or before, I would like SQL Server to say, index on "OrderID" column in "#t1" is "recommended".
6) Again, please do note that the temp. table "#t1" is dynamic in nature, and the application will have no knowledge about this and that's the reason the index on "OrderId" can be "pre" created. Therefore, I want the app to suggest to the user that index on "OrderID" column in "#t1" is recommended.

I can make the application run some queries at the end (in the same session id 100 where the #t1 is created) to make the index suggestions to the user.

Any ideas?



GilaMonster
GilaMonster
SSC-Forever
SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)SSC-Forever (47K reputation)

Group: General Forum Members
Points: 47259 Visits: 44392
I'll just repeat what I said earlier

Even with normal tables, you can't just automatically create whatever the missing index DMVs suggest, unless you want a huge amount of redundant and unnecessary indexes. For temp tables you just need the additional step of saving the exec plan and then checking the missing index info from there (or analysing the plan if the info is not there) instead of the missing index DMVs.


In other words, any form of automatic index creation, automatic index recommendations without testing, is a very bad idea and is more likely to lead to reduced performance overall.


Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass


GSquared
GSquared
SSChampion
SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)SSChampion (14K reputation)

Group: General Forum Members
Points: 14375 Visits: 9729
ganeshmuthuvelu (12/18/2012)
I am the OP here. It appears that the thread became a discsussion for whether indexes are needed on temp tables or not. I can say for sure that the answer is "yes, it is needed", particularly when large volume of data is on the temp tables and joins are done.

Now, if you all can be kind enough to read my original question and help me find a way to make SQL Server find out such missing indexes that would help performance, it will be helpful. I have summaized it again below.

1) Let's say "#t1" is created in session id 100 and #t1 has no indexes on any columns and has 100,000 rows in it.
2) The end user has his "own, custom" stored procedure that works on the data in the #t1 and joins #t1 with another perm. table say "MainData".
3) The temp. table "#t1" and the stored procedure are created by the end user.
4) Whereas the perm. table "MainData" (part of the application) is properly indexed, the "OrderID" column in "#t1" is not indexed which is used in the join with "MainData" and this causes slow performance. If the #t1 is indexed on "OrderID", the performance is great.
5) Now, at the end of the SP execution or before, I would like SQL Server to say, index on "OrderID" column in "#t1" is "recommended".
6) Again, please do note that the temp. table "#t1" is dynamic in nature, and the application will have no knowledge about this and that's the reason the index on "OrderId" can be "pre" created. Therefore, I want the app to suggest to the user that index on "OrderID" column in "#t1" is recommended.

I can make the application run some queries at the end (in the same session id 100 where the #t1 is created) to make the index suggestions to the user.

Any ideas?

(emphasis added)

Yes. Don't try to do this. At least partially because it's actually impossible.

Automating index creation, especially on temp tables, is going to be overly complex, and is unlikely to accomplish what you need.

First, any index creation on temp tables has to be in the same scope as the creation of the temp table. That means you'd be writing dynamic code to create your dynamic indexes on dynamic temp tables. That, by itself, is going to be a nightmare to write, worse than a nightmare to maintain, and worse than that to document.

Stored procedures can't take user-interaction on-the-fly. They're not application code. You can't pause a proc, tell the user "index blah blah blah is recommended", have them create the index, and then continue on with the proc. They simply don't work that way. Thus, step 5 is actually impossible to do with temp tables.

You could do something like this with global temp tables. But those have horrible problems with name-collision, as well as other issues, and I don't recommend them.

It sounds to me like what you really need is user-bound "perm tables". Just plain old ordinary tables, with a scope that limits them so that only one user has access to them at a time. You could do that by placing them in a user-specific database or user-specific schema within one database. If you want to avoid putting them in normal databases, with the implications for backup, DR, etc., that this carries with it, then create them in tempdb, but create them as persisted tables instead of temp tables.

On the other hand, it might be advantageous to have them in the normal database and DR scope, so that work-interruptions by things like power outages, server crashes, etc., don't necessarily force a user to start over again from scratch. If the tables are going to be as large as you are saying, then it follows that they probably take a fair amount of work to create and use, and having them inside DR scope for point-in-time recovery might be a good thing.

Using persisted tables bypasses your problem with not being able to get index suggestions for temp tables, since you already know how to get index suggestions for regular tables.

It also makes step 5, providing the user with the index suggestions and asking for analysis on them, very easy to do, instead of impossible.

Does that help?

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search