Temp table for Performance Impact

  • Hi,

    We are using table variables inside a stored procedure. It works fine on the QA and staging servers, but when it executes in production it causes a performance problem.

    The table variable only holds approximately 4,000 records, and multiple users access the same SP from different screens.

    Should we change the table variable to a #temp table?

    Will that help resolve the performance issue?

  • Probably, but can't say for certain without a lot more info. The query, indexes and exec plan would help.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • nitin.varshney (9/27/2012)


    When it executes in production, it causes a performance problem.

    The table variable only holds approximately 4,000 records.

    You need to figure out which part of the SP is creating the issue. There might be other problems: missing indexes, queries not written to match the indexes (or vice versa), the amount of data, improper data filters, etc. Study the execution plan.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Do you have the same amount of data in test as in production? If not, your testing isn't going to find volume based performance issues, nor will you be able to do accurate tuning in test.

    For tuning, first update your statistics and do basic tuning (use nothing you don't need, join to the least number of rows possible using sargable expressions, etc.), then rewrite your query several ways - temp tables, temp tables with indexes, table variables, table variables with UNIQUE constraints and primary keys, derived tables, etc., and watch the CPU, reads, writes, and duration in SQL Profiler. Try other ways of writing it entirely. Watch the execution plan - sometimes SQL Server, even modern versions, chooses completely the wrong index.
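    As a lighter-weight alternative to Profiler, the same measurements can be scripted directly. A sketch (dbo.Products and the join are illustrative stand-ins for the real query under test):

```sql
-- Compare a temp-table variant against a table-variable variant of
-- the same work; the STATISTICS output shows CPU, elapsed time, and reads.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant 1: temp table
SELECT ProductID, CategoryID
INTO #Work
FROM dbo.Products;

SELECT p.ProductID
FROM dbo.Products AS p
JOIN #Work AS w ON w.ProductID = p.ProductID;

DROP TABLE #Work;

-- Variant 2: table variable
DECLARE @Work TABLE (ProductID INT, CategoryID INT);

INSERT INTO @Work (ProductID, CategoryID)
SELECT ProductID, CategoryID
FROM dbo.Products;

SELECT p.ProductID
FROM dbo.Products AS p
JOIN @Work AS w ON w.ProductID = p.ProductID;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```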

    Know enough about your system (or ask someone who does, or find out) to know if you're IO constrained, CPU constrained, or ?? constrained on your production system; weigh that factor more heavily. For instance, on many of my IO constrained systems, I'm delighted to see a 30% CPU increase in exchange for a 5x read decrease, and on those with very light CPU, I'm happy to see a 30% CPU increase in exchange for halving the number of reads.

    Every environment is different.

  • GilaMonster (9/27/2012)


    Probably, but can't say for certain without a lot more info. The query, indexes and exec plan would help.

    Agree, as usual, although I would probably phrase it as "highly likely" instead of "probably". I will add that I DEFAULT to using temporary tables and only switch to table variables in VERY exceptional situations.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (10/2/2012)


    I will add that I DEFAULT to using temporary tables and only switch to table variables in VERY exceptional situations.

    I do the same; I've almost never seen any situation where table variables had a measurable performance improvement. The reverse, however, is not true.

  • When you create your temp table you can define primary keys and indexes as well. This may improve performance.

    Here's a copy of some notes I put together for my developers concerning the use of temp tables (additions and/or corrections are welcomed):

    Always include the following before creating the temp table and then again as soon as the temp table is no longer needed. The temp table will go out of scope by itself when the procedure ends, but if an error occurs when testing before the table goes out of scope it may still exist and need to be manually dropped.

    IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL

    DROP TABLE #TempTable

    Then when creating the table and inserting the data…

    “HEAP” METHOD (no primary key or index). EASY to code and could be used for SMALL temp tables with just a few rows. It shouldn’t be used if the table is ever part of a JOIN since it has no primary key or index. Note that using an ORDER BY on an insert adds unnecessary overhead since SQL chooses the insert order internally and ignores the Order By even though the ordering is executed by the select statement.

    SELECT ProductID, CategoryID

    INTO #TempTable

    FROM dbo.Products

    ORDER BY CategoryID -- Note that a sort here is unnecessary overhead

    “BAD” CLUSTERED INDEX METHOD (manual creation of a named primary key). A MAJOR potential problem with this method is that since the Primary Key is specifically named, if the stored procedure is called concurrently by another process the constraint creation will fail since it already exists in the scope of a different procedure. To avoid this issue the PK name would have to be dynamically created and inserted into a string of dynamic sql text and executed that way (see the very last section where this is done for a non-clustered index). There is an easier way to do the same thing (next section)!

    CREATE TABLE #TempTable

    (

    [ID] INT IDENTITY(1,1) NOT NULL,

    [ProductID] INT NULL,

    [CategoryID] INT NULL,

    CONSTRAINT [PK_#TempTable] PRIMARY KEY CLUSTERED

    ([ID] ASC) WITH (PAD_INDEX = OFF

    ,STATISTICS_NORECOMPUTE = OFF

    ,IGNORE_DUP_KEY = OFF

    ,ALLOW_ROW_LOCKS = ON

    ,ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ) ON [PRIMARY]

    INSERT INTO #TempTable (ProductID, CategoryID)

    SELECT ProductID, CategoryID

    FROM dbo.Products

    “GOOD” CLUSTERED INDEX METHOD (auto creation of primary key). This method creates a unique Primary Key (in this example the form [PK__#TempTable__3214EC27582F7143]). The code is cleaner and there is no chance of collisions. If desired, the PK can contain multiple columns just like any index or key like “PRIMARY KEY (ID,ProductID,CategoryID)”. This would be the MOST efficient method since all the columns would be indexed, but of course this could only work if all the columns are [NOT NULL] and the combined values of the columns are always unique.

    Usually it’s not worth the trouble, but if an additional non-clustered index is desired then it should be created AFTER any data is inserted, and it must have a unique name to avoid collisions with concurrent runs of the procedure. The reason for creating the non-clustered index after the data insert is that if the index already exists, every inserted row requires an individual update to the index. By inserting the data first, there is only a one-time build of the non-clustered index, which in net is much faster than individual updates. This is not an issue for a primary key (clustered unique index) because the records are inserted in order and no key lookups or key inserts are required when inserting an initial batch of new records into an empty table.

    CREATE TABLE #TempTable

    (

    [ID] INT IDENTITY(1,1) NOT NULL,

    [ProductID] INT NULL,

    [CategoryID] INT NULL,

    PRIMARY KEY (ID),

    UNIQUE (ID))

    INSERT INTO #TempTable (ProductID, CategoryID)

    SELECT ProductID, CategoryID

    FROM dbo.Products

    /* Optional if additional non-clustered index(s) are desired. */

    /* Note: string values must be NVARCHAR */

    DECLARE

    @strCreateIndex NVARCHAR(1000)

    ,@IndexUniqueID NVARCHAR(50)

    SET @IndexUniqueID = N'IX__#TempTable__'

    + REPLACE(CAST(NEWID() AS NVARCHAR(50)),'-','')

    SET @strCreateIndex = N'CREATE NONCLUSTERED INDEX ['

    + @IndexUniqueID + '] ON [#TempTable] ([CategoryID],[ProductID])'

    EXEC sp_executesql @strCreateIndex

    BTW, Microsoft (since SQL2005) recommends using actual temp tables (#temp) rather than table variables (@temp). The primary reason is that using a #temp table allows normal procedure compilation and caching, whereas table variables (@temp) may force a recompile every time a procedure is run. Also, table variables do not update statistics or the transaction log so they will fail if used in a TRANS/ROLLBACK TRANS scenario. There is a myth that table variables reside in memory and thus improve performance (one of the articles below makes that mistaken assumption). But other articles I’ve read have proved that even table variables use tempdb, and performance tests between the two temp table types show no difference. There are a BUNCH of other reasons listed for not using table variables here:

    http://databases.aspfaq.com/database/should-i-use-a-temp-table-or-a-table-variable.html

    and

    http://blogs.msdn.com/b/sqlcat/archive/2008/10/09/table-variable-vs-temp-table-and-estimated-vs-actual-query-plan.aspx

    HOWEVER, in spite of all this…only table variables (@temp) can be used inside of functions if a temp table is required. So that is one (perhaps the only) legitimate use of the table variable (@temp) method.
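    For example, a minimal sketch of that restriction (the function name is hypothetical; dbo.Products is borrowed from the examples above):

```sql
-- A multi-statement table-valued function must return its results
-- through a table variable; declaring a #temp table inside the
-- function body is not allowed.
CREATE FUNCTION dbo.fn_ProductsByCategory (@CategoryID INT)
RETURNS @Result TABLE (ProductID INT, CategoryID INT)
AS
BEGIN
    INSERT INTO @Result (ProductID, CategoryID)
    SELECT ProductID, CategoryID
    FROM dbo.Products
    WHERE CategoryID = @CategoryID;
    RETURN;
END
```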

  • 1) What is the point of this (isn't PRIMARY KEY sufficient by itself and the UNIQUE redundant?):

    PRIMARY KEY (ID),

    UNIQUE (ID))

    2) I can probably count on 2 hands the number of times in nearly 20 years of SQL Server work where one or more indexes on a temp table have IMPROVED performance of a sproc overall. I cannot count the number of times I have REMOVED them from clients' code to make things FASTER.

    3) "The primary reason is that using a #temp table allows normal procedure compilation and caching, whereas table variables (@temp) may force a recompile every time a procedure is run." - incorrect on several points

    4) " table variables do not update statistics" - nothing to do with updating stats, the limitation concerns whether or not they are created

    5) "or the transaction log so they will fail if used in a TRANS/ROLLBACK TRANS scenario" - incorrect. Actually this leads to one of the few scenarios where they are beneficial, namely keeping data in place on rollback to subsequently use for another purpose such as auditing or debugging.
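    A quick sketch of that rollback behavior (illustrative, standalone batch):

```sql
-- A table variable keeps its rows after a ROLLBACK, while inserts
-- into a #temp table are undone; that makes it handy for preserving
-- audit/debug data when a transaction fails.
DECLARE @Audit TABLE (Msg VARCHAR(100));

BEGIN TRAN;
    INSERT INTO @Audit (Msg) VALUES ('reached step 1');
ROLLBACK TRAN;

-- The row is still there: the rollback did not touch the table variable.
SELECT Msg FROM @Audit;
```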

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (10/4/2012)


    1) What is the point of this (isn't PRIMARY KEY sufficient by itself and the UNIQUE redundant?):

    PRIMARY KEY (ID),

    UNIQUE (ID))

    2) I can probably count on 2 hands the number of times in nearly 20 years of SQL Server work where one or more indexes on a temp table have IMPROVED performance of a sproc overall. I cannot count the number of times I have REMOVED them from clients' code to make things FASTER.

    3) "The primary reason is that using a #temp table allows normal procedure compilation and caching, whereas table variables (@temp) may force a recompile every time a procedure is run." - incorrect on several points

    4) " table variables do not update statistics" - nothing to do with updating stats, the limitation concerns whether or not they are created

    5) "or the transaction log so they will fail if used in a TRANS/ROLLBACK TRANS scenario" - incorrect. Actually this leads to one of the few scenarios where they are beneficial, namely keeping data in place on rollback to subsequently use for another purpose such as auditing or debugging.

    Working with big tables in DM environments, it's almost always worth putting an index or two on temp tables. Anything over 10k rows and it's likely to make a measurable difference. Less than 1k rows I never bother. It takes insignificant time to create an index and test.

    6) Note that using an ORDER BY on an insert adds unnecessary overhead since SQL chooses the insert order internally and ignores the Order By even though the ordering is executed by the select statement.

    SQL Server doesn't ignore the ORDER BY - but it will put those rows into the temp table however it sees fit, so to speak. If you're running a few tens of rows or more into a temp table and you're going to cluster it, then test with and without ordering by the columns you're going to cluster. Measure the time taken for both statements (the INSERT and the CREATE CLUSTERED INDEX) and choose whichever works fastest. Using ORDER BY improves performance - quite a lot in some cases - sufficiently often to make this simple test worthwhile.
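    That test might look like this (a sketch, reusing the dbo.Products example from earlier in the thread):

```sql
-- Variant A: load unordered, then build the clustered index.
SELECT ProductID, CategoryID
INTO #TempA
FROM dbo.Products;
CREATE CLUSTERED INDEX CIX_TempA ON #TempA (CategoryID);

-- Variant B: load pre-ordered on the future clustering key.
SELECT ProductID, CategoryID
INTO #TempB
FROM dbo.Products
ORDER BY CategoryID;
CREATE CLUSTERED INDEX CIX_TempB ON #TempB (CategoryID);

-- Compare the elapsed times of the load and CREATE INDEX steps
-- (e.g. with SET STATISTICS TIME ON) and keep the faster variant.
DROP TABLE #TempA;
DROP TABLE #TempB;
```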

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • >>Working with big tables in DM environments, it's almost always worth putting an index or two on temp tables. Anything over 10k rows and it's likely to make a measurable difference. Less than 1k rows I never bother. It takes insignificant time to create an index and test.

    Your experience is VASTLY different from mine. 99.9x% of the time I see clients use temp tables (or I use them myself) in a sproc, the temp table is populated and joined to EXACTLY ONCE in a single query. In this case there are exceptionally few times where an index will a) be useful in the query at all and b) even if it is, still result in OVERALL performance improvements over having spent the effort to create the index in the first place. One possible use is when a clustered index allows the optimizer to pick a MERGE join instead of a HASH join on some permanent table, but even here it is quite possible if not likely that the optimizer will simply introduce its own SORT of the temp table to facilitate that MERGE.

    BTW, I can only think of ONE time where I needed more than one index on a temp table to be most efficient, and that was for an ugly, iterative, cursor-based monster that could not be refactored into a set-based solution. This repeated hit on the temp table during the cursor looping did gain benefits from multiple indexes.

    Sorry, but we will have to agree to disagree on this one.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (10/4/2012)


    >>Working with big tables in DM environments, it's almost always worth putting an index or two on temp tables. Anything over 10k rows and it's likely to make a measurable difference. Less than 1k rows I never bother. It takes insignificant time to create an index and test.

    Your experience is VASTLY different from mine. 99.9x% of the time I see clients use temp tables (or I use them myself) in a sproc, the temp table is populated and joined to EXACTLY ONCE in a single query. In this case there are exceptionally few times where an index will a) be useful in the query at all and b) even if it is, still result in OVERALL performance improvements over having spent the effort to create the index in the first place. One possible use is when a clustered index allows the optimizer to pick a MERGE join instead of a HASH join on some permanent table, but even here it is quite possible if not likely that the optimizer will simply introduce its own SORT of the temp table to facilitate that MERGE.

    BTW, I can only think of ONE time where I needed more than one index on a temp table to be most efficient, and that was for an ugly, iterative, cursor-based monster that could not be refactored into a set-based solution. This repeated hit on the temp table during the cursor looping did gain benefits from multiple indexes.

    Sorry, but we will have to agree to disagree on this one.

    Heh - funny how it works out, Kevin. I've never had your experience of having to remove indexes. Neither experience should be necessary - as I said earlier, the time taken to test all options is usually trivial.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thanks for the corrections/comments everyone. If I get the time I will study them and update my notes on temp tables. Most of that was pulled together from multiple sources and I depend a lot on the expertise of others on such subjects.

    I would like to add something concerning the issue of creating indexes on temp tables. I personally have NEVER found a secondary index on a temp table to be worth the trouble...I find creating a primary key sufficient.

    However, I've had developers create indexes on a temp table before and then see the procedure crash "unexpectedly." (One developer in particular loved to do this.) Thus I outlined a method for creating a truly temporary index that will work. Of course, the wisdom of creating such an index must be challenged and tested!


    SELECT INTO #TEMPTABLE will perform less transaction logging than INSERT INTO #TEMPTABLE, which should be considered when loading hundreds of thousands or millions of rows, but with only 4,000 rows that aspect should be trivial. Confirm that it isn't the actual SELECT statement by itself that's performing poorly on production.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • There's only a difference in logging in bulk-logged or simple recovery models.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
