Things You Didn't Know About Temp Tables and Table Variables

  • Jeff Moden (7/18/2008)


    Again, I wouldn't use a correlated subquery for this. But that's not the big problem here... what does this query do to the @tab table? Nothing... ID remains the same.

    Hi Jeff,

    You are correct, I am updating the id with the same value.

But since this is only an example to show the syntax, I would not put too much meaning into this.

    But thanks for your hint.

    Best Regards,

    Chris Büttner

  • Thanks Christian...heh... yeah, I was just wondering if the OP understood that. Very strange sample code to begin with...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Adam Machanic (7/18/2008)


    Jeff Moden (7/18/2008)


    Viswanath S. Dhara (7/18/2008)


    I am trying to update a table using a correlated subquery.

    Why? Correlated subqueries are a form of RBAR and can actually be worse than a cursor in some cases. Use a correctly formed joined update instead...

    Your "correctly formed" update is non-ANSI Compliant and is nondeterministic in some cases. The query optimizer can--and usually does--optimize out correlated subqueries so that they're not reevaluated on a row-by-row basis. Certainly not worse than a cursor.

    That would be true if there is only 1 column being updated... each correlation in an Update in SQL Server 2000 (haven't checked 2k5, yet) adds about the same amount of time.

    To be clear, correlated sub-queries can produce performance thousands of times worse than a cursor depending on the usage and the data... please see the following...

    http://www.sqlservercentral.com/articles/T-SQL/61539/

    Admittedly, that type of correlation won't work in an update and I apologize for confusing the issue...

... And, please tell me that you're not one of those folks who writes only ANSI compliant code. 😉
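For anyone following along, here is a minimal sketch of the two update forms under discussion. The Sales.Summary and Sales.Detail tables are hypothetical, purely for illustration:

```sql
-- Hypothetical tables for illustration: Sales.Detail holds line items,
-- Sales.Summary holds one row per order.

-- Correlated-subquery form (ANSI compliant): the subquery is logically
-- re-evaluated for each row of Sales.Summary.
UPDATE Sales.Summary
   SET Total = (SELECT SUM(d.Amount)
                  FROM Sales.Detail d
                 WHERE d.OrderID = Sales.Summary.OrderID);

-- Joined form (UPDATE ... FROM is a T-SQL extension, not ANSI): the
-- aggregate is computed once per order in a single pass.
UPDATE s
   SET s.Total = d.OrderTotal
  FROM Sales.Summary s
  JOIN (SELECT OrderID, SUM(Amount) AS OrderTotal
          FROM Sales.Detail
         GROUP BY OrderID) d
    ON d.OrderID = s.OrderID;
```

Whether the optimizer collapses the correlated form into the same plan as the joined form depends on the version and the query, which is exactly the point of contention above.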

    --Jeff Moden



  • By the way... Folks, say hi to Adam Machanic... he's the guy that first introduced me to the idea of what I currently call a Tally table.

    Thanks again for that, Adam.

    --Jeff Moden



  • Is there really any difference or advantage between temp tables and table variables? If there is not enough cache on hand and both will eventually hit tempdb - is the only advantage that table variables don't hit it explicitly to start?

    While both the temp table and the table variable use tempdb in essentially the same way (i.e., data stays in memory until the table reaches a certain size), the table variable will not write to the transaction log (which is why it is unaffected by rollbacks). This can decrease the overall I/O used by an operation. Another consideration is statistics: the optimizer always estimates a table variable at 1 row, even if it holds tens of thousands, which can result in a less-than-optimal join choice (nested loops vs. hash vs. merge). Also, temp tables can trigger recompiles of the stored procedures that create them, because the actual execution plan cannot be persisted (the table reference doesn't exist until you are executing), which can also be costly.
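A quick way to see the rollback behavior mentioned above (a sketch; run it as a single batch):

```sql
DECLARE @t TABLE (ID int);
CREATE TABLE #t (ID int);

BEGIN TRANSACTION;
    INSERT INTO @t VALUES (1);
    INSERT INTO #t VALUES (1);
ROLLBACK TRANSACTION;

-- The table variable keeps its row; the temp table's insert is undone.
SELECT COUNT(*) AS TableVarRows  FROM @t;   -- 1
SELECT COUNT(*) AS TempTableRows FROM #t;   -- 0

DROP TABLE #t;
```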

  • There have already been a couple replies on this so I won't belabor too much, but you definitely can create a PK on a table variable, you just can't name it. You can also create an index and apply any appropriate attributes as well (unique, clustered, nonclustered, etc.). SQL Server won't let you name these objects for exactly the reason described: possibility of naming collisions.

    As to the questions about advantages/disadvantages, here's my take: The table variable will give you a performance boost if you can keep its contents skinny and short (translation: fewest possible columns and fewest possible rows). From what I've read and heard, SQL Server tries its best to keep the table var's data memory-resident and will only "spill over" into tempdb if it surpasses a certain threshold. It also helps to add an index or two to the table var if you are joining, sorting or grouping on column(s). Don't go hog wild with indexes or you'll make too much work for the engine and very quickly lose any advantage you may have gained. The other performance boost comes from the fact that it is not a persisted object. Because it is not included in the transaction, it does not add write I/O to the log file. And because it automatically falls out of scope when the batch completes, it is very quickly and efficiently deleted without executing an additional statement in your code.

    I've used table var's quite a bit over the past couple years and am very happy with the performance. One of my most common uses is when I need to left join a table but I only need a subset of its data. I have had very good results when I load the subset into the table var and then left join that in the parent query.

    Speaking of efficiency, please don't forget about table-valued functions! These puppies are executed in their own memory context and are extremely fast. I think that TVF's are one of the most under-utilized great features of SQL 2000/2005. If you're ever thinking you've got an application for a table variable but wish you could filter its contents, there's your answer.

  • Well, this is why I don't post a whole lot, I sometimes type without thinking! You CANNOT execute any DDL for the table var outside of its declaration, therefore you CANNOT create indexes. Inline constraints are the only option for table var's. If you do need indexes, temp tables may be a better option.

    One other way table vars can increase performance is that they tend to trigger fewer recompiles in your sp's...
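To make the distinction concrete, here is a sketch of what each object allows (the names are illustrative only):

```sql
-- Table variable: indexes come only from inline constraints at declaration.
DECLARE @tv TABLE
(
    ID   int PRIMARY KEY CLUSTERED,   -- clustered index, via the PK
    Code char(4) UNIQUE               -- nonclustered index, via UNIQUE
);
-- CREATE INDEX ... ON @tv is a syntax error; no DDL after declaration.

-- Temp table: normal DDL applies, so non-unique, named indexes are fine.
CREATE TABLE #tt (ID int, Code char(4));
CREATE NONCLUSTERED INDEX IX_tt_Code ON #tt (Code);
DROP TABLE #tt;
```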

  • PKs and UNIQUE constraints are backed by indexes, so you can create indexes, as long as they're unique. Need a non-unique index? Add an identity column and use it as a key to create uniqueness... There are ways around virtually any limitation!

    --
    Adam Machanic
    whoisactive

  • Gr8... thank you for the wonderful article

    --

    Rijosh

  • foursaints (9/12/2008)


    There have already been a couple replies on this so I won't belabor too much, but you definitely can create a PK on a table variable, you just can't name it. You can also create an index and apply any appropriate attributes as well (unique, clustered, nonclustered, etc.). SQL Server won't let you name these objects for exactly the reason described: possibility of naming collisions.

    As to the questions about advantages/disadvantages, here's my take: The table variable will give you a performance boost if you can keep its contents skinny and short (translation: fewest possible columns and fewest possible rows). From what I've read and heard, SQL Server tries its best to keep the table var's data memory-resident and will only "spill over" into tempdb if it surpasses a certain threshold. It also helps to add an index or two to the table var if you are joining, sorting or grouping on column(s). Don't go hog wild with indexes or you'll make too much work for the engine and very quickly lose any advantage you may have gained. The other performance boost comes from the fact that it is not a persisted object. Because it is not included in the transaction, it does not add write I/O to the log file. And because it automatically falls out of scope when the batch completes, it is very quickly and efficiently deleted without executing an additional statement in your code.

    I've used table var's quite a bit over the past couple years and am very happy with the performance. One of my most common uses is when I need to left join a table but I only need a subset of its data. I have had very good results when I load the subset into the table var and then left join that in the parent query.

    Speaking of efficiency, please don't forget about table-valued functions! These puppies are executed in their own memory context and are extremely fast. I think that TVF's are one of the most under-utilized great features of SQL 2000/2005. If you're ever thinking you've got an application for a table variable but wish you could filter its contents, there's your answer.

    Temp tables act the same as table variables when it comes to memory... they, too, are extremely fast when used properly... read Q4/A4 in the following link...

    ... which says...

    Q4: Are table variables memory-only structures that are assured better performance as compared to temporary or permanent tables, because they are maintained in a database that resides on the physical disk?

    A4: A table variable is not a memory-only structure. Because a table variable might hold more data than can fit in memory, it has to have a place on disk to store data. Table variables are created in the tempdb database similar to temporary tables. If memory is available, both table variables and temporary tables are created and processed while in memory (data cache).

    Sure, sure... use of a Temp Table normally guarantees a recompile... there are a lot of other things, like small changes in data, that also guarantee a recompile. The expected lifetime of an execution plan on a busy system is about 5 minutes. AND, table variables are guaranteed to evaluate as a single-row table... not always the best thing for an optimal execution plan.

    Don't use either a Temp Table or Table Variable based on rules of thumb. Do some testing... both will surprise you under different circumstances.
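One simple way to run such a test yourself (a sketch for SQL Server 2005; compare the actual execution plans of the two final queries):

```sql
-- Load the same 100,000 rows into a temp table and a table variable.
CREATE TABLE #big (N int PRIMARY KEY);
DECLARE @big TABLE (N int PRIMARY KEY);

INSERT INTO #big (N)
SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
  FROM sys.all_columns a CROSS JOIN sys.all_columns b;

INSERT INTO @big (N) SELECT N FROM #big;

-- Compare the estimated row counts in the actual plans: #big carries
-- statistics (~100,000 rows), while @big is estimated at 1 row, which
-- can steer the optimizer toward a nested-loops plan.
SELECT COUNT(*) FROM #big t JOIN #big u ON u.N = t.N;
SELECT COUNT(*) FROM @big t JOIN @big u ON u.N = t.N;

DROP TABLE #big;
```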

    --Jeff Moden



  • We can add constraints on a table variable, like primary key, unique key, and check constraints, e.g.:

    DECLARE @MyTable TABLE
    (
        ProductID int PRIMARY KEY IDENTITY(1,1),
        Price money CHECK (Price < 10.0)
    )

    We cannot create a nonclustered index on a table variable, and we cannot change the declaration of a table variable once created, i.e., we cannot alter the table variable.

  • sandeep.pote (9/23/2008)


    We can add constraints on a table variable, like primary key, unique key, and check constraints ...

    ...

    We cannot create non clustered index on table variable ...

    Just to clarify this statement (again!):

    Both PKs and UNIQUE constraints are backed by indexes. By default, UNIQUE constraints are backed by nonclustered indexes. So by creating one of these constraints, you are in fact creating a nonclustered index. Therefore, it's fair to say that you can create both clustered and nonclustered indexes on a table variable.

    Need a non-unique index? No problem, just use an IDENTITY column as a "uniquifier"...

    DECLARE @n TABLE
    (
        x INT NOT NULL,
        y INT IDENTITY(1,1) NOT NULL,
        PRIMARY KEY (x, y)
    )

    The IDENTITY column will of course increase the row size, so if you do this and you won't be inserting more than ~64,000 rows, consider using SMALLINT rather than INT (you can seed the IDENTITY with -32,000 to give yourself more room).

    --
    Adam Machanic
    whoisactive

Viewing 12 posts - 31 through 41 (of 41 total)
