Is a Temporary Table Really Necessary?

  • The top reason to use SELECT INTO is to avoid logging... that makes it very fast.

    Use global temporary tables outside of transactions and you will never lock tempdb; no contention issue....

    It is normal temp tables (#) used within a transaction that lock tempdb.
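
    A minimal sketch of that pattern; ##Staging is a hypothetical name and Orders is assumed to be a Northwind-style table:

    -- Global temp table (##), created outside any explicit transaction.
    -- It is visible to all sessions until the creating session ends and
    -- no other session is still using it.
    SELECT OrderID, RequiredDate, ShippedDate
    INTO ##Staging
    FROM Orders

    -- ... work with ##Staging here, or from other sessions ...

    DROP TABLE ##Staging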

    Prakash Heda
    Lead DBA Team - www.sqlfeatures.com
    Video sessions on Performance Tuning and SQL 2012 HA

  • Sorry about my CTE Post.

    I did not see that both jmapledoram & SQL Noob had already posted this as an alternative.

    matteo

  • OK, I decided to see what happens when I run code that produces the same result each time using a few different methods, and here is what I found.

    Example 1: Using Table Variable

    Runtime: 8+ minutes

    declare @t table
        (OrderID int,
         RequiredDate datetime not null,
         ShippedDate datetime null)

    set nocount on
    declare @x int, @job datetime
    set @job = getdate()
    set @x = 0

    while @x < 1000
    begin
        DBCC FREEPROCCACHE
        DBCC DROPCLEANBUFFERS
        delete from @t
        insert into @t select OrderID, RequiredDate, ShippedDate from Orders
        set @x = @x + 1
    end

    select @job, getdate()

    Example 2: Using Temp Table w/Insert Into and Delete From Method

    Runtime: 2.5 minutes

    create table #text
        (OrderID int,
         RequiredDate datetime not null,
         ShippedDate datetime null)

    set nocount on
    declare @x int, @job datetime
    set @job = getdate()
    set @x = 0

    while @x < 1000
    begin
        DBCC FREEPROCCACHE
        DBCC DROPCLEANBUFFERS
        delete from #text
        insert into #text select OrderID, RequiredDate, ShippedDate from Orders
        set @x = @x + 1
    end

    select @job, getdate()
    go
    drop table #text
    go

    Example 3: Using Temp Table w/Insert Into and Truncate Table Method

    Runtime: 1.25 minutes

    create table #text
        (OrderID int,
         RequiredDate datetime not null,
         ShippedDate datetime null)

    set nocount on
    declare @x int, @job datetime
    set @job = getdate()
    set @x = 0

    while @x < 1000
    begin
        DBCC FREEPROCCACHE
        DBCC DROPCLEANBUFFERS
        truncate table #text
        insert into #text select OrderID, RequiredDate, ShippedDate from Orders
        set @x = @x + 1
    end

    select @job, getdate()
    go
    drop table #text
    go

    Example 4: Using Temp Table w/Select Into and Drop Table Method

    Runtime: 6-12 seconds

    set nocount on
    declare @x int, @job datetime
    set @job = getdate()
    set @x = 0

    while @x < 1000
    begin
        DBCC FREEPROCCACHE
        DBCC DROPCLEANBUFFERS
        select OrderID, RequiredDate, ShippedDate into #text from Orders
        set @x = @x + 1
        drop table #text
    end

    select @job, getdate()
    go

    I ran each example 5 times. Of course, the examples are designed to stress each query by taking away any advantages.

    And the comment that "SELECT INTO isn't logged" is incorrect. I checked separately and found that, like all objects stored in tempdb, the tempdb log reflected the inserts, so it is logged.
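
    For anyone who wants to repeat that check, a rough sketch using the undocumented fn_dblog function (column names and behavior are not guaranteed across versions), run right after a SELECT ... INTO a temp table:

    USE tempdb
    GO
    -- The most recent tempdb log records; the inserts from a
    -- SELECT ... INTO #tmp show up here.
    SELECT TOP 20 [Current LSN], Operation, AllocUnitName
    FROM sys.fn_dblog(NULL, NULL)
    ORDER BY [Current LSN] DESC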

    Anyway, if anyone has an example where a table variable performs better than a temp table, using Northwind or other available database data, please let me know so I can strip it down and see if there is anything that shows why and when one may be better than the other.

    As for the above examples: when you shorten the cycle from 1000 to, say, 10, they all run about the same; at 100 there is a little more noticeable difference. And when you remove FREEPROCCACHE and DROPCLEANBUFFERS, the longer-running items pick up quite a bit of speed.

  • Simon Villiers (6/5/2006)


    By "orphaned temp tables" I mean temp tables in tempdb that no longer have any connecting processes/sessions/etc. I know they are dropped when the connection terminates, and when I first saw this in the test environment I assumed it was a leftover from development in Query Analyzer or similar.

    When I saw this in the production environment I got an inkling something was amiss. Checking with the DBA revealed that there had been instances where temp tables remained after the end of the connection, and that there was a job to check for this. This was not a daily occurence, mind, just something that was done when tempdb became a little bloated; one of the things to check.

    I haven't seen this myself, and am quite curious to do so coz this shouldn't happen. This is about the extent of my knowledge on this situation. If you'd like more info, drop me an email and I'll see if I can dig something up.

    S.

    On our busy production servers we see a buildup of what you call "orphaned" temp tables (SQL 2005). These are dropped temp tables that have not been cleaned up yet. In fact, there is a counter called Temp Tables For Destruction under the SQL Server:General Statistics object that shows how many tables are out there for the system cleanup thread to dispose of.

    But to my knowledge, there is no way to explicitly drop these objects without a restart of SQL. If your DBA has a way of doing an explicit drop, PLEASE let me know. This has caused us grief ever since moving to SQL 2005 and has forced us to do frequent (sometimes nightly) reboots to avoid a degradation of performance. (I posted this issue to this forum several months ago but did not get much feedback.)

    We are coming to the conclusion that the temp table count is related to 1) system memory and 2) the number of databases on an instance. We typically have between 300 and 500 databases per instance.
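
    For anyone wanting to watch that counter, a minimal sketch (SQL 2005 and later); note the object_name prefix varies with the instance name:

    -- Read the "Temp Tables For Destruction" counter via DMV.
    SELECT object_name, counter_name, cntr_value
    FROM sys.dm_os_performance_counters
    WHERE counter_name = 'Temp Tables For Destruction'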

    Any help would be appreciated.

    Regards,
    Rubes

  • I guess in 90% of all cases where table variables / temp tables are applied, there is probably a better solution available like:

    - Rewriting the queries (fix bad access paths, remove functions around indexed fields; see the sketch after this list)

    - Updating statistics

    - Using proper indexes (proper type, proper fields, proper index size)

    - Avoiding too many columns in one table (you could take it to the extreme...)

    - Avoiding non-set based operations

    - Horizontal Partitioning

    - etc. etc.
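
    To illustrate the first point, a minimal sketch of removing a function from around an indexed field, assuming a Northwind-style Orders table with an index on OrderDate:

    -- Non-sargable: the function around OrderDate prevents an index seek.
    SELECT OrderID FROM Orders WHERE YEAR(OrderDate) = 1997

    -- Sargable rewrite: a range predicate lets the optimiser seek the index.
    SELECT OrderID FROM Orders
    WHERE OrderDate >= '19970101' AND OrderDate < '19980101'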

    That's at least my experience with my queries from the past 😉

    I would be interested in others opinions regarding this!

    I also agree with the posters who mentioned that there is no such thing as black and white for these things. It's always a bit of this and a bit of that. You can never avoid the work of weighing the benefits against the drawbacks if you want to make a well-reasoned decision. It will never be the perfect decision because, as mentioned above, it is always a bit of this (good) and a bit of that (bad).

    But in order to make a good decision, you need to know all possible solutions to a problem and their characteristics. Otherwise you are just guessing and probably making the wrong guess.

    Ok, time for me to go to bed 😀

    Best Regards,

    Chris Büttner

  • jmapledoram (5/30/2008)


    But correct me if I'm wrong: doesn't a table variable get stored in RAM rather than tempdb? You will want to consult your DBA before writing a lot of these. You could just be transferring workload from one area to another without the hardware to support it.

    Yes and no... if a table variable fits in memory, then it does just like a derived table does... it uses memory. If it doesn't fit in memory, it will use TempDB, just like a derived table does.

    Now, here's the surprise for some folks... where does a temp table live? If you said TempDB, you're only half right because, just like a table variable, if it fits in memory, it lives in memory. Both will have an "entry" in TempDB, and a derived table can appear as a "work" table in TempDB... the same holds true for CTEs.

    Don't take my word for how both Table Variables and Temp Tables live in either memory or TempDB depending on their size though... I've posted it several times before on this very thread... if you haven't done so already, PLEASE read Q3/A3 and Q4/A4 in the following URL...

    http://support.microsoft.com/default.aspx?scid=kb;en-us;305977&Product=sql2k

    For those that think SELECT/INTO isn't logged in a database with the SIMPLE recovery model... EVERY action in a database is logged... SELECT/INTO, if it meets certain conditions, is MINIMALLY logged and will still blow away an INSERT into a new table even if the database is in the FULL recovery model.

    Again... don't take my word for it... run the following in each recovery model... notice the number of logical reads in the INSERT INTO example...

    --===== Create and populate a 10,000 row test table.
         -- This is the "controlled" source for further tests
    SELECT TOP 10000
           SomeID = IDENTITY(INT,1,1),
           SomeInt = ABS(CHECKSUM(NEWID()))%50000+1,
           SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID()))%26+65)
                        + CHAR(ABS(CHECKSUM(NEWID()))%26+65)
      INTO dbo.JBMTest
      FROM Master.dbo.SysColumns t1,
           Master.dbo.SysColumns t2 --Lack of join criteria makes this a CROSS-JOIN

    --===== A table is not properly formed unless a Primary Key has been assigned
         -- Takes about 1 second to execute.
    ALTER TABLE dbo.JBMTest
      ADD PRIMARY KEY CLUSTERED (SomeID)
    GO

    ----------------------------------------------------------------------------------------
    -- The INSERT/INTO test
    ----------------------------------------------------------------------------------------
    --===== Create a table for the Insert/Into test
    CREATE TABLE Table1 (SomeID INT, SomeInt INT, SomeLetters2 CHAR(2))

    --===== Do the test while capturing some statistics information
    SET NOCOUNT ON
    PRINT '===== INSERT INTO ====='
    SET STATISTICS IO ON
    SET STATISTICS TIME ON

    INSERT INTO Table1
    SELECT * FROM dbo.JBMTest

    SET STATISTICS TIME OFF
    SET STATISTICS IO OFF
    PRINT REPLICATE('=',78)
    GO

    ----------------------------------------------------------------------------------------
    -- The SELECT/INTO test
    ----------------------------------------------------------------------------------------
    --===== Do the test while capturing some statistics information
    SET NOCOUNT ON
    PRINT '===== SELECT INTO ====='
    SET STATISTICS IO ON
    SET STATISTICS TIME ON

    SELECT *
      INTO Table2
      FROM dbo.JBMTest

    SET STATISTICS TIME OFF
    SET STATISTICS IO OFF
    PRINT REPLICATE('=',78)
    GO

    --===== Simple housekeeping
    DROP TABLE dbo.Table1, dbo.Table2, dbo.JBMTest

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • pduplessis (5/29/2008)


    I agree that in most cases you can avoid the use of temp tables by structuring your query slightly differently, but for sure, sometimes they are unavoidable.

    [snip]...

    Firstly, for ANYTHING going into a WHERE statement, I queried the dimensions first and got the respective keys (for example, only certain products are deemed to be "activating" products). I popped these into a temporary products table, which I could then use in an inner join on my monster fact table.

    [snip]...

    By forcing the temporary table, in effect you are saying that, before you merge join, hash/match, or whatever, creating a single set of index seeks to produce a subset of records to join in a query can be more performant than a repeating series of clustered (or even covering non-clustered) index seeks each time the subset is referenced. Even though the number of index pages scanned is logarithmically related to the cardinality of the table, it can still be non-trivial for large tables. At some point, copying a small amount of data into a new table is more efficient than a certain number of index seeks.

    Or, in other words: I know that there are a couple of (generally key, always indexed) columns in this table, and I know that if I restrict my working table to these before I start evaluating the rest of the query, I can eliminate a whole pile of index page reads, because the smaller number of rows I am interested in can be pre-fetched and are much easier to index and join over, being a much smaller range than the index on the much larger entire table.

    But in the converse case, where the predicate is not a good indicator of selectivity (i.e. accounts < 100,000), the pre-restriction won't help, because the subset won't be much smaller than the original table (if indeed there are 110,000 accounts), and creating an indexed temporary table will be a waste of effort. For a given database design, row cardinality, and list of selection predicates, you can tell whether this method is going to be a benefit or not.
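
    A minimal sketch of the pre-filtering pattern being described; DimProduct, FactProductSales, and the IsActivating flag are hypothetical names:

    -- Resolve the WHERE-clause dimensions to keys first...
    SELECT ProductKey
    INTO #ActivatingProducts
    FROM DimProduct
    WHERE IsActivating = 1

    CREATE CLUSTERED INDEX IX_AP ON #ActivatingProducts (ProductKey)

    -- ...then inner join the small keyed subset to the monster fact table.
    SELECT f.ProductKey, SUM(f.SalesAmount) AS Sales
    FROM FactProductSales f
         INNER JOIN #ActivatingProducts p ON p.ProductKey = f.ProductKey
    GROUP BY f.ProductKey

    DROP TABLE #ActivatingProducts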

    It's forcing the data access design back into procedural programming (well, two steps), away from the logical set definition of SQL (i.e. stating the desired result and leaving the determination of the best evaluation method to the query optimiser).

    The fact that this is necessary in practice is a pity, but I guess it gives no benefit until you get large sets, and a DB never really goes back to being smaller. Whether this method is better is really a property of how selective a predicate is on an index, e.g. account = 123456789 on the product_sales table.

  • Nice article......

  • One of our analysts sent me a bunch of queries they run daily for some reports. Each one is a SELECT INTO a temp table, then another SELECT INTO another temp table, joining data from the previous temp table and other tables.

    I didn't feel like writing a crazy query with a bunch of WHERE statements, so I took the lazy way out: I created views for each of the temp tables, and running a SELECT from the top view calls the other two views automatically.
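
    Roughly what that looks like, as a sketch; the view names and underlying Northwind-style tables are illustrative:

    -- Each former temp table becomes a view; vStage2 builds on vStage1.
    CREATE VIEW dbo.vStage1 AS
    SELECT o.OrderID, o.CustomerID
    FROM dbo.Orders o
    WHERE o.ShippedDate IS NULL
    GO
    CREATE VIEW dbo.vStage2 AS
    SELECT s.OrderID, c.CompanyName
    FROM dbo.vStage1 s
         INNER JOIN dbo.Customers c ON c.CustomerID = s.CustomerID
    GO
    -- Selecting from the top view evaluates the chain automatically.
    SELECT * FROM dbo.vStage2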

  • :ermm: Out of the blue... off-the-cuff remark...

    Wouldn't it be nice to create a temp table and be able to specify the database in which it is created 😀

    Far away is close at hand in the images of elsewhere.
    Anon.

  • What would be the point?

  • Might be good if you had multiple filegroups and multiple temp tables you were gonna join? ;)

  • CREATE TABLE [temporaryTable]...

    LOL

    Atlantis Interactive - SQL Server Tools
    My blog
    Why I wrote a sql query analyzer clone

  • Control disk space and where it is used

    Reduce interference from other apps/databases, i.e. locks

    Minimise the effect of rogue apps/databases filling tempdb

    I did state it was off the cuff.... didn't say it was a perfect idea :w00t:

    Far away is close at hand in the images of elsewhere.
    Anon.

  • You can always put tempdb on multiple spindles using multiple files.

    I think MS did it this way because it would be a nightmare if apps created temp tables in user databases that couldn't be deleted until the next restart. This way all the junk is in the one database you rarely look into.
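
    A sketch of the multiple-files approach; the file name, path, and sizes below are placeholders to adjust for your server:

    -- Add a second tempdb data file, ideally on a separate spindle.
    ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2,
              FILENAME = 'E:\SQLData\tempdev2.ndf',
              SIZE = 512MB,
              FILEGROWTH = 128MB)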
