Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««12

Missing Indexes on Temp tables Expand / Collapse
Author
Message
Posted Monday, December 17, 2012 8:34 AM
SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Today @ 4:53 PM
Points: 1,746, Visits: 2,553
TheSQLGuru (12/16/2012)
My statement about making "queries" faster is about the ENTIRE process. The act of populating a temp table with an index already on the temp table before data is added requires not-insignificant work to populate the index(es) as the data is added. This is especially true of the clustered PK scenario. And adding an index after data population requires a full scan of the temp table and the actual effort required to create and store the index. The VAST majority of the times I have seen this done the table is hit ONCE in a join or some other query type and the cumulative cost/duration/effort of adding the index made the entire process run slower.

Gonna stick by my statements here - in my experience most indexes on temp tables do not make for overall faster/more efficient executions (outside of my exception about iterative hits on said temp table where you get an index seek or other significant benefit).



If the data is only being read/used once, why is it being stored in tempdb at all??


SQL DBA,SQL Server MVP('07, '08, '09)
I'm not fat, I'm gravity challenged.
Post #1397284
Posted Monday, December 17, 2012 9:06 AM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Monday, April 14, 2014 1:34 PM
Points: 15,442, Visits: 9,588
ScottPletcher (12/17/2012)
TheSQLGuru (12/16/2012)
My statement about making "queries" faster is about the ENTIRE process. The act of populating a temp table with an index already on the temp table before data is added requires not-insignificant work to populate the index(es) as the data is added. This is especially true of the clustered PK scenario. And adding an index after data population requires a full scan of the temp table and the actual effort required to create and store the index. The VAST majority of the times I have seen this done the table is hit ONCE in a join or some other query type and the cumulative cost/duration/effort of adding the index made the entire process run slower.

Gonna stick by my statements here - in my experience most indexes on temp tables do not make for overall faster/more efficient executions (outside of my exception about iterative hits on said temp table where you get an index seek or other significant benefit).



If the data is only being read/used once, why is it being stored in tempdb at all??


I've seen plenty of situations where dumping data into a temp table, then using the temp table, is more efficient than complex sub-queries.

You only use the data once, but it splits up the execution of the final operation into smaller chunks, and the optimizer does a much better job on those than on a single mother-of-all-queries type operation.

Basically, a temp table pre-populated, can often outperform complex CTEs, cross-applies, etc., simply by reducing the complexity the optimizer has to deal with.

In those cases, a reasonable clustered index on the temp table can sometimes help, but also sometimes not.

I've gone both ways on temp tables. Some need indexing. Some need a lack of indexing.

For example, I recently wrote an ETL process that includes very complex business rules. The main query has 18 sub-queries, using Cross or Outer Apply, rooted on data in a single table. Two of the sub-queries were so complex that I moved them to temp tables. Just splitting those 2 out of the main query, took total execution time from over an hour to under a minute. Since the process is supposed to run hourly, that's a critical improvement. But they're both technically "temp heaps", since they're completely unindexed. Tried adding a clustered index to each, but it didn't improve anything at all.

On the other side, I once wrote a very complex proc that compared an order to a complex set of "business suggestions", based on heuristic patterns. To simplify the code, and to deal with what would otherwise have been very aggressive locking issues, I used temp tables with some serious indexing, instead of the primary tables. This allowed the order data to be compared to "the rules" as they were at the time the check was initiated, without having to lock the whole database. The indexes used in that process were critical to the speed and efficiency of the whole thing. Lots of indexes on dozens of temp tables. Took the proc from several minutes total run-time, down to about 5 seconds, and got rid of numerous lock/block/deadlock issues at the same time.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #1397307
Posted Tuesday, December 18, 2012 7:31 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 4:46 PM
Points: 35,959, Visits: 30,250
I'll second what GSquared just said.

Sometimes, the optimizer needs a little help. Sometimes that help is to just very quickly isolate a much smaller subset in a temp table (which the optimizer sometimes does on its own, BTW. They're called "Work" tables and you can see them with SET STATISTICS IO ON). It can make all the difference in the world.

Probably the greatest example that I can personally remember was with a year-end query that some folks had written at a previous company. It was so resource intensive that they actually asked me to build a separate instance of the database to run the bloody query because it would paralyze the server for a good 45 minutes.

I got it down to 8 seconds that made only a "pip" on CPU usage by isolating one part of the query to a Temp Table and then joining on that Temp Table. To be sure, the data in the Temp Table was only used once.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

"Change is inevitable. Change for the better is not." -- 04 August 2013
(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1397790
Posted Tuesday, December 18, 2012 8:18 AM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Saturday, August 17, 2013 10:06 AM
Points: 90, Visits: 301
I am the OP here. It appears that the thread became a discsussion for whether indexes are needed on temp tables or not. I can say for sure that the answer is "yes, it is needed", particularly when large volume of data is on the temp tables and joins are done.

Now, if you all can be kind enough to read my original question and help me find a way to make SQL Server find out such missing indexes that would help performance, it will be helpful. I have summaized it again below.

1) Let's say "#t1" is created in session id 100 and #t1 has no indexes on any columns and has 100,000 rows in it.
2) The end user has his "own, custom" stored procedure that works on the data in the #t1 and joins #t1 with another perm. table say "MainData".
3) The temp. table "#t1" and the stored procedure are created by the end user.
4) Whereas the perm. table "MainData" (part of the application) is properly indexed, the "OrderID" column in "#t1" is not indexed which is used in the join with "MainData" and this causes slow performance. If the #t1 is indexed on "OrderID", the performance is great.
5) Now, at the end of the SP execution or before, I would like SQL Server to say, index on "OrderID" column in "#t1" is "recommended".
6) Again, please do note that the temp. table "#t1" is dynamic in nature, and the application will have no knowledge about this and that's the reason the index on "OrderId" can be "pre" created. Therefore, I want the app to suggest to the user that index on "OrderID" column in "#t1" is recommended.

I can make the application run some queries at the end (in the same session id 100 where the #t1 is created) to make the index suggestions to the user.

Any ideas?




Post #1397830
Posted Tuesday, December 18, 2012 8:26 AM


SSC-Forever

SSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-ForeverSSC-Forever

Group: General Forum Members
Last Login: Today @ 12:20 PM
Points: 41,529, Visits: 34,445
I'll just repeat what I said earlier

Even with normal tables, you can't just automatically create whatever the missing index DMVs suggest, unless you want a huge amount of redundant and unnecessary indexes. For temp tables you just need the additional step of saving the exec plan and then checking the missing index info from there (or analysing the plan if the info is not there) instead of the missing index DMVs.


In other words, any form of automatic index creation, automatic index recommendations without testing, is a very bad idea and is more likely to lead to reduced performance overall.



Gail Shaw
Microsoft Certified Master: SQL Server 2008, MVP
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Post #1397839
Posted Tuesday, December 18, 2012 8:42 AM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Monday, April 14, 2014 1:34 PM
Points: 15,442, Visits: 9,588
ganeshmuthuvelu (12/18/2012)
I am the OP here. It appears that the thread became a discsussion for whether indexes are needed on temp tables or not. I can say for sure that the answer is "yes, it is needed", particularly when large volume of data is on the temp tables and joins are done.

Now, if you all can be kind enough to read my original question and help me find a way to make SQL Server find out such missing indexes that would help performance, it will be helpful. I have summaized it again below.

1) Let's say "#t1" is created in session id 100 and #t1 has no indexes on any columns and has 100,000 rows in it.
2) The end user has his "own, custom" stored procedure that works on the data in the #t1 and joins #t1 with another perm. table say "MainData".
3) The temp. table "#t1" and the stored procedure are created by the end user.
4) Whereas the perm. table "MainData" (part of the application) is properly indexed, the "OrderID" column in "#t1" is not indexed which is used in the join with "MainData" and this causes slow performance. If the #t1 is indexed on "OrderID", the performance is great.
5) Now, at the end of the SP execution or before, I would like SQL Server to say, index on "OrderID" column in "#t1" is "recommended".
6) Again, please do note that the temp. table "#t1" is dynamic in nature, and the application will have no knowledge about this and that's the reason the index on "OrderId" can be "pre" created. Therefore, I want the app to suggest to the user that index on "OrderID" column in "#t1" is recommended.

I can make the application run some queries at the end (in the same session id 100 where the #t1 is created) to make the index suggestions to the user.

Any ideas?

(emphasis added)

Yes. Don't try to do this. At least partially because it's actually impossible.

Automating index creation, especially on temp tables, is going to be overly complex, and is unlikely to accomplish what you need.

First, any index creation on temp tables has to be in the same scope as the creation of the temp table. That means you'd be writing dynamic code to create your dynamic indexes on dynamic temp tables. That, by itself, is going to be a nightmare to write, worse than a nightmare to maintain, and worse than that to document.

Stored procedures can't take user-interaction on-the-fly. They're not application code. You can't pause a proc, tell the user "index blah blah blah is recommended", have them create the index, and then continue on with the proc. They simply don't work that way. Thus, step 5 is actually impossible to do with temp tables.

You could do something like this with global temp tables. But those have horrible problems with name-collision, as well as other issues, and I don't recommend them.

It sounds to me like what you really need is user-bound "perm tables". Just plain old ordinary tables, with a scope that limits them so that only one user has access to them at a time. You could do that by placing them in a user-specific database or user-specific schema within one database. If you want to avoid putting them in normal databases, with the implications for backup, DR, etc., that this carries with it, then create them in tempdb, but create them as persisted tables instead of temp tables.

On the other hand, it might be advantageous to have them in the normal database and DR scope, so that work-interruptions by things like power outages, server crashes, etc., don't necessarily force a user to start over again from scratch. If the tables are going to be as large as you are saying, then it follows that they probably take a fair amount of work to create and use, and having them inside DR scope for point-in-time recovery might be a good thing.

Using persisted tables bypasses your problem with not being able to get index suggestions for temp tables, since you already know how to get index suggestions for regular tables.

It also makes step 5, providing the user with the index suggestions and asking for analysis on them, very easy to do, instead of impossible.

Does that help?



- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #1397851
« Prev Topic | Next Topic »

Add to briefcase ««12

Permissions Expand / Collapse