Is a Temporary Table Really Necessary?

  • The code below illustrates one other difference between temp tables and table variables: dynamic SQL can see temp tables created by the calling script, but it can't see table variables. Presumably this reveals some substantive difference in the way they are treated behind the scenes.

    ----- Temp table: the table created here IS visible to the dynamic SQL

    create table #test (id int identity, test nvarchar(20))
    declare @sql nvarchar(4000)
    select @sql = 'insert #test (test) select ''ohdear'''
    print @sql
    exec sp_executesql @sql
    select * from #test        -- returns the row inserted by the dynamic SQL

    ------ Table variable: @tab is NOT visible inside sp_executesql

    declare @tab table (id int identity, test nvarchar(20))
    declare @sql2 nvarchar(4000)
    select @sql2 = 'insert @tab (test) select ''ohdear'''
    print @sql2
    exec sp_executesql @sql2   -- fails: @tab is out of scope for the dynamic batch
    select * from @tab

    www.sql-library.com

  • Cool, that solves my problem. I must have been doing something wrong in my syntax before.

  • Good article, and a great discussion. But let's not forget that performance should not be the only criterion - your code needs to be maintainable and extensible. Using temp tables allows you to break your code into blocks that can be tested & debugged more easily, and allows your successors to understand what you were trying to accomplish.

    For example, when writing a stored procedure you can search & replace "#" with "xyz" to create actual tables, and then run portions of the code to verify that it's really doing what you expect. Once everything works, change "xyz" back to "#" and you're done (a small sketch of this is at the end of this post).

    There are certainly times when subqueries & the other techniques discussed are required for performance reasons, but that makes your code significantly more difficult to work with, when (not if) schemas or business rules change.
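    A minimal sketch of that search-and-replace trick, using a made-up #stage table and source, would look like this:

    -- Normal version: intermediate results go into a temp table
    create table #stage (id int, amount money)
    insert #stage (id, amount)
    select id, amount from dbo.SomeSource   -- hypothetical source table

    -- Debug version: "#" replaced with "xyz" so the table persists between runs
    -- and each block can be executed and inspected on its own
    create table xyzstage (id int, amount money)
    insert xyzstage (id, amount)
    select id, amount from dbo.SomeSource

    select * from xyzstage   -- verify this step before moving on

    -- Once everything works, change "xyz" back to "#"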

     

  • In the article there's a link to an article on the author's site (http://www.transactsql.com). The link doesn't work for me; is it working for anyone else? Anyone know if something has happened to the site?

     

    Thanks

    Ed

    Kindest Regards,

    Just say No to Facebook!
  • It has to do with the narrower scope of variables vs. temp tables. Temp tables are available to all batches in a connection; variables are scoped to the batch. Therefore dynamic SQL cannot utilize a table variable declared outside that batch. This is also why you cannot populate a table variable with the execution of a stored procedure...

    The other major difference (which has been mentioned, but which I'll repeat here) is that table variables are not affected by transactions. This can be useful, but it can also cause problems for those not aware of this limitation/difference (a small illustration is below).
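    A minimal sketch of that transaction behaviour, with made-up table names:

    declare @tv table (val int)
    create table #tt (val int)

    begin transaction
        insert @tv (val) values (1)
        insert #tt (val) values (1)
    rollback transaction

    -- The temp table insert is rolled back; the table variable insert is not
    select count(*) as temp_table_rows     from #tt   -- returns 0
    select count(*) as table_variable_rows from @tv   -- returns 1

    drop table #tt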

    /*****************

    If most people are not willing to see the difficulty, this is mainly because, consciously or unconsciously, they assume that it will be they who will settle these questions for the others, and because they are convinced of their own capacity to do this. -Friedrich August von Hayek

    *****************/

  • Yeah, I would use the same.

  • Hey, how do you use temp variables?

  • Nicely said and I agree 100%! The only thing I wouldn't do is change the temp tables back to table variables... Q3 and Q4 in the following link are the reason why...

    http://support.microsoft.com/default.aspx?scid=kb;en-us;305977&Product=sql2k

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The article prefers to have the temp table created first and then push data in with an INSERT INTO ... SELECT query. I would prefer SELECT ... INTO ... FROM for creating temp tables, simply because I gain performance by logging minimal information. Am I correct in saying so?

     

     

    -- Version 1: create the temp table, index it, then INSERT ... SELECT (fully logged)
    create table #some_temp_table (name sysname, info smallint)
    create clustered index some_temp_table_CLI on #some_temp_table (name)

    insert into #some_temp_table
    select top 10000 s1.name, s1.info
    from sysobjects s1, sysobjects s2

    drop table #some_temp_table
    go

    -- Version 2: SELECT ... INTO creates the table on the fly (minimally logged),
    -- then the clustered index is added afterwards
    select top 10000 cast(s1.name as sysname) as name,
                     cast(s1.info as smallint) as info
    into #some_temp_table
    from sysobjects s1, sysobjects s2

    create clustered index some_temp_table_CLI on #some_temp_table (name)

    drop table #some_temp_table

    Regards
    Shrikant Kulkarni

  • I agree that in most cases you can avoid the use of temp tables by structuring your query slightly differently, but for sure, sometimes they are unavoidable.

    One such case that springs to mind happened not so long ago.

    One particular area was bleeding severely, as their batch seemed to be dying several deaths on a specific job, so the manager (armed with a case of tequila - this wasn't the area I was reporting to and this was for sure a huge favor) asked me to look at why the performance of a particular query in the job was killing the environment. When I took this query and had a look at the underlying data, I asked him to upgrade that case to perhaps the entire bottle store. The query was taking more than 36 hours to execute.

    Basically, they had a fact table with 2.5 billion rows of transactional data. They were attempting to calculate the first "activating transaction" on an account, as this would trigger some sales incentivisation to occur. When the query was built initially (no doubt on a significantly smaller subset of data), nobody actually had the foresight to realise that this fact table would grow by between 5 and 10 million rows a day. Therefore, they did a simple join and update statement, aggregating at the same time. They actually did think of a proper indexing strategy, as anything and everything which required indexing was done properly on the monster fact as well as the surrounding dimensions.

    To solve this was actually one of the easier ones, and definitely one where I lived by the use of temporary tables.

    Firstly, for ANYTHING going into a where clause, I queried the dimensions first and got the respective keys (for example, only certain products are deemed to be "activating" products). I popped these into a temporary products table, which I could then use in an inner join against my monster fact.

    Secondly, this query was applicable ONLY to accounts where the "activation" had not taken place before. Therefore, another temporary table holding account keys would enable a join on account key.

    Thirdly, aggregating while joining on 2 billion rows of data is most probably the single biggest reason for the performance bottleneck. So after joining to the transactional fact with my temp tables, I would write a "mini fact" containing only transactions which should be looked at into another temporary table. Therefore, the aggregation is reduced from 2.5 billion to a few thousand, and subsequent to this, the aggregation takes a few milliseconds to complete. Even this aggregation gets stored into a temporary table. Why? Updating while aggregating may cause a further performance bottleneck. The update is now done against my accounts on a 1:1 basis, making it equally as fast.

    When I finalised the optimisation of that batch, it took between 5 and 10 minutes to achieve the same thing as the 36 hour statement. Later on, I partitioned the transactional fact and added an archiving strategy, which reduced the query to seconds. But that is irrelevant for this posting. What really juiced the performance was a combination of temporary tables and some sound business logic.

    It is indeed about whatever works best for a particular situation, but I have to stress, temp tables are a last resort (a rough sketch of the staged approach is below).
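    A rough sketch of those steps, with entirely made-up table and column names (the real logic was of course more involved):

    -- Step 1: resolve the WHERE-clause filters against the dimensions first
    select product_key
    into #activating_products
    from dbo.DimProduct
    where is_activating = 1

    -- Step 2: only accounts that have not activated yet
    select account_key
    into #open_accounts
    from dbo.DimAccount
    where activation_date is null

    -- Step 3: build a "mini fact" of candidate transactions instead of
    -- aggregating against the full 2.5 billion row fact table
    select f.account_key, f.transaction_date, f.amount
    into #mini_fact
    from dbo.FactTransactions f
    join #activating_products p on p.product_key = f.product_key
    join #open_accounts a on a.account_key = f.account_key

    -- Step 4: aggregate the small set into its own temp table
    select account_key, min(transaction_date) as first_activation_date
    into #first_activation
    from #mini_fact
    group by account_key

    -- Step 5: a straight 1:1 update against the accounts
    update d
    set d.activation_date = fa.first_activation_date
    from dbo.DimAccount d
    join #first_activation fa on fa.account_key = d.account_key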

  • No mention was made of table variables. Whenever I need to use a temporary table I try to use a table variable first. I have found significant performance improvements in pulling a subset of a large table into a table variable and using the table variable for results. Unfortunately you cannot use them to store results from a stored procedure; if they get that right I'll never see the need to use a # or ## table.

  • Hello everybody!

    I have read this article, but it is quite mysterious to me why using temp tables gets the data faster.

    Any ideas?

  • Eric Inman (6/2/2006)


    "Table Variables are more performant than temp tables because they don't use tempdb."

    In my experience I have seen table variables still get built in tempdb. Do some simple testing and you will see the same results. You are accurate in that table variables do seem to perform better than classic temp tables.

    Agreed! Table variables use tempdb once they grow beyond a certain size....

  • I found temp tables very handy for working around long running queries, and I learned how to use them to break down complex problems, but there are some things you always have to keep in mind when using temp tables:

    1) temp tables are stored in tempdb and table variables are usually kept in memory: don't use table variables when the number of rows to store exceeds around 1000 records.

    2) always provide your temp table with some indexes, or the only possible plan will be a table scan. If you decide to use a clustered index, insert the data in that order (a short example follows below).

    3) drop temp tables when you don't need them any more, to avoid uncontrolled tempdb growth

    Just to say a word about global temp tables, I found them handy when I had to "cache" the results of a complex query used by a stored procedure without creating a physical table to hold the cache.
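    A quick sketch of points 2 and 3, with illustrative names only:

    -- Create the temp table, give it a clustered index, and load in index order
    create table #orders_stage (order_id int not null, order_date datetime, amount money)
    create clustered index IX_orders_stage on #orders_stage (order_id)

    insert into #orders_stage (order_id, order_date, amount)
    select order_id, order_date, amount
    from dbo.Orders            -- hypothetical source table
    order by order_id          -- insert in clustered index order

    -- ... use #orders_stage in joins here ...

    -- Drop it as soon as it is no longer needed to limit tempdb growth
    drop table #orders_stage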

    -- Gianluca Sartori

  • OK - a bit of myth busting

    1) Table Variables outperform Temporary Tables

    This is not necessarily true. You can't create an index on a table variable, and if you are dealing with large data sets, then a temporary table with reasonable clustered indexing will totally trash a table variable.

    2) Derived tables avoid using tempdb

    Again, not necessarily true. Derived tables quite often end up as a system-generated temporary table in tempdb; this is done by the query processor if it decides it needs to (although I've never looked into it enough to find out why).

    It just really goes to show - the only way to find the best performance is to test, test and test again (a bare-bones comparison along those lines is sketched below)...
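    A bare-bones sketch of that kind of test (arbitrary row counts, and sysobjects used only as a convenient data source) might be:

    set statistics time on

    -- Table variable: no separate indexes, no statistics
    declare @tv table (id int, name sysname)
    insert @tv (id, name)
    select top 100000 s1.id, s1.name
    from sysobjects s1, sysobjects s2

    select count(*) from @tv t join sysobjects o on o.id = t.id

    -- Temp table: same data, but with a clustered index and statistics
    create table #tt (id int, name sysname)
    create clustered index IX_tt on #tt (id)
    insert #tt (id, name)
    select top 100000 s1.id, s1.name
    from sysobjects s1, sysobjects s2

    select count(*) from #tt t join sysobjects o on o.id = t.id

    drop table #tt
    set statistics time off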

    Atlantis Interactive - SQL Server Tools
    My blog
    Why I wrote a sql query analyzer clone
