Comparing Table Variables with Temporary Tables

  • Nick Beagley (6/10/2009)


    There's also the issue that you won't get a parallel query plan if you modify a table variable, but temp tables are fine:

    From SQL2008 BOL: http://msdn.microsoft.com/en-us/library/ms175010.aspx

    Queries that modify table variables do not generate parallel query execution plans. Performance can be affected when very large table variables, or table variables in complex queries, are modified. In these situations, consider using temporary tables instead. For more information, see CREATE TABLE (Transact-SQL). Queries that read table variables without modifying them can still be parallelized.

    Thanks for this link... I'll add this to the article.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2
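
    As a quick illustration of the BOL note quoted above, here is a minimal sketch. Compare the actual execution plans of the two INSERTs: the table-variable load is forced to run serially, while the temp-table load can use parallelism on its SELECT side (assuming the query is costed above the server's parallelism threshold; table names and row counts here are illustrative).

    -- Sketch: parallelism and table-variable modification (SQL 2008).
    declare @tv table (n int not null)

    -- Modifying a table variable never gets a parallel plan.
    insert into @tv (n)
    select top (1000000) a.column_id
    from master.sys.columns a cross join master.sys.columns b

    create table #tt (n int not null)

    -- The same load into a temp table can parallelize the SELECT side.
    insert into #tt (n)
    select top (1000000) a.column_id
    from master.sys.columns a cross join master.sys.columns b

    drop table #tt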

  • Excellent article, Wayne! A much-needed treatment of a commonly misunderstood subject. It's already in my briefcase... 🙂

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Great article! Very insightful and will definitely help with current and future developments. Ciao.

  • Tim Walker (6/10/2009)


    Question (and I wouldn't even have asked it if I hadn't read this first): what is the point of defining an index on a table variable if it has no statistics? I'm assuming that if the index is unique, this will optimise well because of the 'one row in a table variable' technique, but otherwise this won't help, will it?

    It may help. Because there are no stats, the optimiser will always think that any operation against the table variable will return one row (unique or non-unique index). That's a prime opportunity (as far as the optimiser's concerned) for an index seek/bookmark lookup. The problem is, if there are actually a lot of rows returned, that bookmark lookup will make the operation more expensive than if the table had been scanned.
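
    For example, a quick sketch (table and column names illustrative; in SQL 2008 an index on a table variable can only be declared through a PRIMARY KEY or UNIQUE constraint):

    -- Sketch: the no-statistics, one-row estimate in action.
    declare @t table
    (
        ID     int not null primary key nonclustered,
        Filler char(100) not null
    )

    insert into @t (ID, Filler)
    select top (10000)
           row_number() over (order by (select null)), 'x'
    from master.sys.columns a cross join master.sys.columns b

    -- Estimated rows: 1. Actual rows: 5000. The one-row estimate makes a
    -- seek plus RID lookup look cheap; at 5000 actual rows a scan would
    -- likely have been cheaper.
    select * from @t where ID <= 5000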

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Gail, thank you for jumping in and answering this for me. Your knowledge of how indexes work far surpasses mine.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • For David McKinney

    Just be wary when using a main procedure to prepare your temp tables, followed by sub-procedures to populate them.

    The article notes that temp tables can be created with a CREATE TABLE statement or a SELECT ... INTO statement, but there is (I think) a gotcha here. If a temp table exists or is in scope and you use SELECT ... INTO, you will generate a separate temp table that may be unique in tempdb, but it will not be possible to distinguish between the two in your procedure. Maybe this is an obvious outcome, because the SELECT ... INTO syntax requires that the destination table doesn't exist, and perhaps it was just poor programming (on my part!) that got me.

    You will see from the code below that test_sp2 creates its own in-scope #temptable using the SELECT ... INTO syntax, and this 'version' of #temptable is not the same as the test_sp1 #temptable.

    Thanks for the excellent comparison article, Wayne.

    -- Create a physical table for some sample data
    create table dbo.realtable (col1 char(4) null, col2 char(4) null)
    GO

    -- Populate some sample data
    insert dbo.realtable (col1, col2)
    select '1000','test'
    union all
    select '2000','rest'
    union all
    select '3000','best'
    GO

    -- Create a stored procedure to:
    -- 1. create an empty temporary table based on the physical table
    -- 2. call two example procedures which will populate the
    --    'in scope' temporary table
    create procedure test_sp1
    as
    begin
        select * into #temptable from dbo.realtable where 1=2

        exec test_sp2 '1000','test'
        select 'After call to sp2: ' msg, * from #temptable

        exec test_sp3 '2000','rest'
        select 'After call to sp3: ' msg, * from #temptable
    end
    GO

    -- Procedure to populate the temp table using "select ... into"
    create procedure test_sp2
        @param1 char(4), @param2 char(4)
    as
    begin
        select col1, col2 into #temptable from dbo.realtable
        where col1 = @param1 and col2 = @param2

        select 'Inside call to sp2: ' msg, * from #temptable
    end
    GO

    -- Procedure to populate the temp table using "insert ... select"
    create procedure test_sp3
        @param1 char(4), @param2 char(4)
    as
    begin
        insert #temptable select col1, col2 from dbo.realtable
        where col1 = @param1 and col2 = @param2

        select 'Inside call to sp3: ' msg, * from #temptable
    end
    GO

    -- Call the main stored procedure
    exec test_sp1

  • twillcomp (6/11/2009)


    For David McKinney

    Just be wary when using a main procedure to prepare your temp tables, followed by sub-procedures to populate them.

    Thanks for pointing this out and including the example.

    David.

  • GilaMonster (6/11/2009)


    Tim Walker (6/10/2009)


    Question (and I wouldn't even have asked it if I hadn't read this first): what is the point of defining an index on a table variable if it has no statistics? I'm assuming that if the index is unique, this will optimise well because of the 'one row in a table variable' technique, but otherwise this won't help, will it?

    It may help. Because there are no stats, the optimiser will always think that any operation against the table variable will return one row (unique or non-unique index). That's a prime opportunity (as far as the optimiser's concerned) for an index seek/bookmark lookup. The problem is, if there are actually a lot of rows returned, that bookmark lookup will make the operation more expensive than if the table had been scanned.

    Thanks for the reply, particularly since my question was worded so badly! Thanks WayneS too.

    Tim


  • Excellent job, Wayne. 🙂 That was perfectly written. Very informative and not over my head. 🙂

    -Roy

  • The article should discuss the wonders of "option (recompile)". It forces SQL to recompile the query plan with knowledge of the number of rows currently in the table variables. It fixes almost all query plans with large table variables without having to use "force order". We use table variables for nearly everything. We now only use "force order" to stop the occasional "lazy spool" (wish I could disable that "feature" globally).

    declare @t table (ID int)
    insert into @t values (1), (2), (3), (4)

    set statistics profile on

    -- EstimateRows is 1
    select * from @t

    -- EstimateRows is 4
    select * from @t option (recompile)

    set statistics profile off

  • sscddr (6/11/2009)


    The article should discuss the wonders of "option (recompile)". It forces SQL to recompile the query plan with knowledge of the number of rows currently in the table variables.

    It's still not an accurate estimate. For full table scans the estimate is right, but not for queries with filters:

    declare @t table (ID int)

    INSERT INTO @t (ID)
    SELECT top (100000) a.column_id
    from master.sys.columns a cross join master.sys.columns b

    -- EstimateRows is 1, Actual is 17134
    select * from @t where ID = 1

    -- EstimateRows is 5623, Actual is 17134
    select * from @t where ID = 1
    option (recompile)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Cool trick, though. Good to know.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • That it is. I'm trying to get some clarification as to where that row estimate comes from. It's not from statistics; there are no auto-stats events fired and no stats visible in TempDB.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It would be interesting to do some testing to see at what point the cost of the recompilations outweighs the overhead of #temp tables.

    I'm taking Kalen Delaney's course in two weeks. I'll try to remember to ask her about this.

  • Dean Cochrane (6/11/2009)


    It would be interesting to do some testing to see at what point the cost of the recompilations outweighs the overhead of #temp tables.

    I'm taking Kalen Delaney's course in two weeks. I'll try to remember to ask her about this.

    What #temp table overhead are you talking about? One of the things that Wayne's article reveals is that temp tables and table variables have *almost* identical overhead.

    The real question isn't "at what point does the cost of the recompilations outweigh the overhead of #temp tables"; rather, it's "to what extent does using inaccurate statistics instead of no statistics compensate for the cost of recompiling?"

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]
