Temp Tables in SQL Server

  • Hi Dinesh, (Sorry if this comes through twice, I posted earlier but it seems to have got lost somewhere.)

    Nice simple article, well done.  Just one thing, which is a common misconception with table variables: they are not stored in memory at all.  Actually they are created in tempdb just the same way as temp tables.  There are lots of articles around saying they are stored in memory, but it's just not true.  A simple example explains why.  Say you have a small SQL Server with 1GB of RAM and you create a table variable and insert 2GB of data into it.  If table variables were stored in memory this would cause lots of issues.  A simple way of seeing that table variables are treated as temp tables is to do the following:

    SET STATISTICS IO ON

    BEGIN
        -- Table variable with an identity primary key
        DECLARE @lognames TABLE (a int NOT NULL IDENTITY(1, 1) PRIMARY KEY, b varchar(255) NOT NULL, c int NOT NULL, d varchar(100))

        INSERT INTO @lognames (b, c, d) VALUES ('1', 2, '3')

        -- The STATISTICS IO output reports reads against a #-prefixed table in tempdb
        SELECT * FROM @lognames
    END

    SET STATISTICS IO OFF

    You will see that you have a table #something.  So table variables are not stored in memory; they are a temp table.  It doesn't state anywhere (that I can find) in BOL that they are stored in memory.
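
    Another quick way to see the tempdb object (my own sketch, not part of the original demo - the catalog query works on SQL 2000 and later):

    -- Declare a table variable, then look at tempdb's catalog in the same batch
    DECLARE @t TABLE (i int NOT NULL)

    SELECT name, crdate
    FROM tempdb..sysobjects
    WHERE type = 'U' AND name LIKE '#%'
    -- The results include a machine-generated #-prefixed name backing @t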

    Anyway well done

    Kind regards

    Greg O

  • quote

    You will see that you have a table #something. So table variables are not stored in memory; they are a temp table. It doesn't state anywhere (that I can find) in BOL that they are stored in memory.

    This has been discussed in other threads and I believe there are articles on the MS site that refer to this. When you create a table variable, SQL will create a #table in tempdb but may or may not use it. It all depends on how much data there is and whether SQL has space for it in memory. It could start off in memory but at any time be transferred to tempdb.

    Far away is close at hand in the images of elsewhere.
    Anon.

  • David,

    In reference to your question about global tables and their scope: a global table will be deleted when all connections referencing it have been terminated.  So if you create a global table via a stored procedure and someone accesses that table, either by executing the same stored procedure or another, they will be accessing the data you created.  If your stored procedure were to complete, the table would not be deleted until everyone accessing the data closed their connection.

    I have yet to find a case where I wanted temporary data accessed by anything other than the process for which the data was created.  If the data needs to be accessed by other processes I simply create a real table and if it is really just a work table that can be deleted I will prefix it with something like tmp or wrk. 

    Hope that made sense.....
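
    For what it's worth, a rough sketch of that lifetime behaviour (the table and procedure names are invented for illustration):

    -- Session 1: create a global temp table from inside a procedure
    CREATE PROCEDURE dbo.BuildGlobalWork
    AS
    BEGIN
        CREATE TABLE ##WorkData (ID int NOT NULL, Payload varchar(100) NOT NULL)
        INSERT INTO ##WorkData (ID, Payload) VALUES (1, 'created by session 1')
    END
    GO

    EXEC dbo.BuildGlobalWork
    -- The procedure has completed, but ##WorkData survives because
    -- session 1's connection is still open

    -- Session 2 (a different connection) can read the same data
    SELECT ID, Payload FROM ##WorkData

    -- ##WorkData is dropped automatically once session 1 disconnects and
    -- no other connection is still referencing it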

  • Both #temp tables and table variables may or may not be created in memory or on disk - it's a matter of available resources.

    Your use of one type of table or another largely relates to your requirements - you should be aware of the limitations and advantages of each.  

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Table variables are inferior to temp tables in performance, except in cases where performance won't really be an issue (small tables).

    Table variables are great for small things, but if you get to large datasets a temp table will blow it away for performance.  Why?  I don't know, but having used them a lot, I know that's how it is.

     

  • Yes, just use create index.  Index creation on a temp table is identical to index creation on a normal table, and it can speed joins and processing of rows immensely, just like putting an index on a real table.
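
    A minimal sketch (the table, columns, and the Customers table below are invented for the example):

    -- Temp table holding some working data
    CREATE TABLE #Orders (OrderID int NOT NULL, CustomerID int NOT NULL, OrderTotal money NOT NULL)

    -- Same CREATE INDEX syntax as for a permanent table
    CREATE NONCLUSTERED INDEX IX_Orders_CustomerID ON #Orders (CustomerID)

    -- Joins on CustomerID can now use the index
    SELECT o.OrderID, o.OrderTotal
    FROM #Orders o
    JOIN dbo.Customers c ON c.CustomerID = o.CustomerID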

  • At least in the past (I'm not sure this is still an issue, though), SQL Server would lock system table rows while the select was being performed.  The rows were not freed up until after the select completed and the temp table was created.  This caused huge contention/deadlock issues in many systems.

    If the table is going to be small, I see no reason not to use select into, but if you are pulling from big tables and/or producing a large dataset such that the query will not execute almost instantly, you are better off creating the temp table first.

  • You should never use select into.  During the population of the temporary table by the select into statement, the allocation block in the data file is locked - so if your population takes a long time you may suffer blocking.  You can avoid the issue on a multi-proc box by creating a data file (not filegroup) for each processor on your server.

    Best practice is not to use Select Into - I have found numerous production issues caused by this.

    Table variables are not subject to statistics, thus the optimiser can only make a best guess - although you can put a PK clustered index on a table variable, it doesn't actually do much other than sort the data.

    Watch the use of table variables within joins; the optimiser does not always choose the best join strategy, which can degrade performance significantly.
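
    A small sketch of the pattern to watch for (the permanent tables here are hypothetical):

    -- No statistics are kept on @keys, so the optimiser tends to assume
    -- a very low row count when it builds the join plan
    DECLARE @keys TABLE (CustomerID int NOT NULL PRIMARY KEY)

    INSERT INTO @keys (CustomerID)
    SELECT CustomerID FROM dbo.Customers WHERE Region = 'West'   -- could be thousands of rows

    -- A join strategy chosen on that low-count assumption (often nested loops)
    -- can perform badly when @keys is actually large; a #temp table with
    -- statistics gives the optimiser better information here
    SELECT o.OrderID
    FROM dbo.Orders o
    JOIN @keys k ON k.CustomerID = o.CustomerID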

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • I am very curious about your statement as to never using select into... I will have to do some testing to verify, as I have experienced otherwise and have recommended it to some developers due to the lack of logging.  When using a VLDB (hundreds of millions of rows), log space becomes a large factor.  What kind of production issues have you seen?

    Thanks,

  • I just had to perform a quick test, although I don't have the time to run it on the VLDB, which does run on a beefy multi-proc machine.  Instead I ran it on a tiny little sandbox machine and cleared the buffers and caches before each test in order to give truer results from a single run.

    Number of rows = 380,013 to be inserted into the temp table.

    Create the temp table and then insert into it = 13 seconds.

    Using a SELECT INTO to create it = 4 seconds.  And that was after I had to free up some space for the log.
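
    For reference, the two approaches compared were roughly the following (table names invented; DBCC DROPCLEANBUFFERS and DBCC FREEPROCCACHE are the usual buffer/cache-clearing commands on a test box):

    -- Clear the data cache and plan cache between runs (test servers only)
    DBCC DROPCLEANBUFFERS
    DBCC FREEPROCCACHE

    -- Approach 1: create the temp table, then insert (a fully logged insert)
    CREATE TABLE #Target (ID int NOT NULL, Val varchar(100) NOT NULL)

    INSERT INTO #Target (ID, Val)
    SELECT ID, Val FROM dbo.SourceTable

    -- Approach 2: SELECT INTO creates and populates in one step
    -- (can be minimally logged, hence the smaller log footprint)
    SELECT ID, Val
    INTO #Target2
    FROM dbo.SourceTable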

  • I also think that saying never use select into is extreme. It has good qualities, as expressed above, in terms of log space and data migrations. It can also be used with an always-false where predicate to copy table structures, etc. You just need to know when you should and when you should not use select into... but that is probably another article in itself.

    Just my $0.02

     


    * Noel

  • select top 0 *
    into temptable
    from table

    insert temptable
    select *
    from table

    Not that I'm saying to do this...the only reason to do "select into" is if you are selecting all the columns in a table, which is something you should never do with temp tables.

    I've got this great replacement for temporary tables...they're called...TABLES!

    That said, locking problems are a good reason to use temp tables. It would be even cooler if this functionality were not tied to putting the data in tempdb, as that causes its own resource locking issues.

    It would be awesome to write against a table in a procedure and be able to specify that table as being "private". If a table were private, an instance of it would be created for the procedure every time it was executed, and the table instance would never be shared by another instance of the same procedure.

    Passing this table to the calling procedure would also be good...something like "internal".

    The correlation to C# access modifiers is not an accident.

    Signature is NULL

  • select into can cause locking in a multi-user environment; this has been and still is clearly documented.

    I don't want to get into silly arguments, but don't throw VLDB into a forum; if you have a terabyte database say so, or if you're moving hundreds of millions of rows say so. The term VLDB is largely meaningless without definition, sorry.

    There are a number of Microsoft white papers which cover temporary tables, table variables, tempdb locking and contention. From my experience performance tuning OLTP and BI systems, removing select into has always produced positive gains.

    I didn't wish to stir up a heated debate - I just observed that there were a number of inaccurate statements and deductions in the original article. I'll say no more !!

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • I apologize if I repeat some of the previous posts, but I think they might be worth repeating... and I believe John Marks referenced the same URL that I do below...

    quote

    "table" is created in memory unlike the temporary table which will create in tempdb, which is obviously much faster.

    Absolutely not true.  Both Temp Tables and Table Variables start out in memory and both spill into TempDB when they get too big for memory.  But, don't take my word for it... the following Microsoft URL...

    http://support.microsoft.com/default.aspx?scid=kb;en-us;305977&Product=sql2k

    ... has this to say about that particular subject...

    Q4: Are table variables memory-only structures that are assured better performance as compared to temporary or permanent tables, because they are maintained in a database that resides on the physical disk?

    A4: A table variable is not a memory-only structure. Because a table variable might hold more data than can fit in memory, it has to have a place on disk to store data. Table variables are created in the tempdb database similar to temporary tables. If memory is available, both table variables and temporary tables are created and processed while in memory (data cache).

    quote

    In SQL Server 2000, new data type called "table" offers same properties as temporary tables.

    Not quite!  Here's what the previously listed URL says about that, the most important of which is the note on statistics usage...

    Q3: What are some of the drawbacks of table variables?

    A3: These are some of the drawbacks as compared to temporary tables:

    - Non-clustered indexes cannot be created on table variables, other than the system indexes that are created for a PRIMARY or UNIQUE constraint. That can influence the query performance when compared to a temporary table with non-clustered indexes.
    - Table variables do not maintain statistics like temporary tables can. Statistics cannot be created on table variables through automatic creation or by using the CREATE STATISTICS statement. Therefore, for complex queries on large tables, the lack of statistics may deter the optimizer from determining the best plan for a query, thus affecting the performance of that query.
    - The table definition cannot be changed after the initial DECLARE statement.
    - Table variables cannot be used in an INSERT EXEC or SELECT INTO statement.
    - CHECK constraints, DEFAULT values, and computed columns in the table type declaration cannot call user-defined functions.
    - You cannot use the EXEC statement or the sp_executesql stored procedure to run a dynamic SQL Server query that refers to a table variable, if the table variable was created outside the EXEC statement or the sp_executesql stored procedure. Because table variables can be referenced in their local scope only, an EXEC statement and a sp_executesql stored procedure would be outside the scope of the table variable. However, you can create the table variable and perform all processing inside the EXEC statement or the sp_executesql stored procedure because then the table variable's local scope is in the EXEC statement or the sp_executesql stored procedure.
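
    The scoping point in that last drawback is easy to demonstrate (my own sketch, not from the KB article):

    -- Fails: @t was declared outside the dynamic batch, so it is
    -- out of scope inside sp_executesql
    DECLARE @t TABLE (i int NOT NULL)
    EXEC sp_executesql N'SELECT i FROM @t'   -- error: must declare @t

    -- Works: declare and use the table variable entirely inside the dynamic batch
    EXEC sp_executesql N'
        DECLARE @t TABLE (i int NOT NULL)
        INSERT INTO @t (i) VALUES (1)
        SELECT i FROM @t'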

    quote

    I've got this great replacement for temporary tables...they're called...TABLES!

    Right on Calvin... if you have the need to create millions of rows in a temporary structure, then it must be for a batch job and you really need to rewrite it to use a permanent working table.  If it's for a GUI, then you really need to take a very close look at how you're writing the code... there should be no need for temporary structures in the millions of rows for GUI code.

    quote

    select into can cause locking in a multi-user environment; this has been and still is clearly documented.

    Have you a Microsoft URL that states/explains that?  I sure would like to see it.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Never mind... here's the answer to that last question in my previous post...

    http://support.microsoft.com/kb/153441/EN-US/

    ... and make sure that you read the part that says ...

    NOTE: This problem does not apply to SQL Server 7.0 and later.

    To summarize, YES, the use of SELECT INTO used to cause blocking problems... WAY back in version 6.5.  That, according to the URL above, is no longer the case and I have to agree... we have several batch processes (I didn't write them) that do a SELECT INTO into temp tables for more than a million records, and because of the complexity of the joins (again, I didn't write the buggers), they take in excess of ten minutes to complete.  If they did lock tempdb, then the database would appear to be frozen for those 10 minutes, and that's patently not the case.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
