SELECT vs INSERT INTO

  • I've been searching for pointers on SELECT..INTO vs INSERT..INTO around forums and the web and haven't been successful yet. Basically, if you are using temp tables in a stored procedure is there a pros/cons of using SELECT something INTO #temptable versus explicitly defining the table and then using INSERT INTO #temptable? Is the answer affected by the number of rows, width of table, use of indexes on the temp tables, etc? Any advice or pointers to articles would be awesome.

    Thanks for any direction.

    Cliff

  • I believe that SELECT...INTO results in less impact on the transaction log. Also, if there are differences in collation between the db you're in and tempdb, SELECT...INTO will create a temporary table with the collation of the table you're selecting from, while creating the table using CREATE TABLE will use the collation of tempdb. Keep in mind that neither method will give any logical structure to your temporary table: remember to put a PK or clustered index on it, just as you would a permanent table.

    Roland Alexander 
    The Monday Morning DBA 
    There are two means of refuge from the miseries of life: music and cats. ~ Albert Schweitzer

  • Also select into will throw an exception if the the destination table already exists. You do not have as much control over the datatypes when using a select into either, it will use the datatype from the source. That is not a bad thing but something to understand about how it works. For example if you have a varchar(10) but you know you will want to update the table after you insert your data and the new value will be longer you probably don't want to use select into. Neither approach is the right or the wrong answer. They both work just fine. There are implications of either approach that need to be considered when writing your code.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (1/11/2013)


    Also select into will throw an exception if the the destination table already exists. You do not have as much control over the datatypes when using a select into either, it will use the datatype from the source. That is not a bad thing but something to understand about how it works. For example if you have a varchar(10) but you know you will want to update the table after you insert your data and the new value will be longer you probably don't want to use select into. Neither approach is the right or the wrong answer. They both work just fine. There are implications of either approach that need to be considered when writing your code.

    Good points! As to the destination table already existing, that shouldn't be a problem if you're selecting into a temp table. SQL Server will give it a unique name in the db even if you don't (i.e. you can create multiple instances of "#tmp" concurrently but each has a unique name in the system table).

    Roland Alexander 
    The Monday Morning DBA 
    There are two means of refuge from the miseries of life: music and cats. ~ Albert Schweitzer

  • Roland Alexander STL (1/11/2013)


    As to the destination table already existing, that shouldn't be a problem if you're selecting into a temp table. SQL Server will give it a unique name in the db even if you don't (i.e. you can create multiple instances of "#tmp" concurrently but each has a unique name in the system table).

    That isn't true. If it was we would never be able to use them. It is also painlessly easy to prove.

    select top 5 *

    into #MyTable

    from sys.objects

    select top 5 *

    into #MyTable

    from sys.objects

    😀

    --EDIT--

    Now if the same temp table was in a different batch your statement is true.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Sean Lange (1/11/2013)


    Roland Alexander STL (1/11/2013)


    As to the destination table already existing, that shouldn't be a problem if you're selecting into a temp table. SQL Server will give it a unique name in the db even if you don't (i.e. you can create multiple instances of "#tmp" concurrently but each has a unique name in the system table).

    That isn't true. If it was we would never be able to use them. It is also painlessly easy to prove.

    select top 5 *

    into #MyTable

    from sys.objects

    select top 5 *

    into #MyTable

    from sys.objects

    😀

    --EDIT--

    Now if the same temp table was in a different batch your statement is true.

    You're right, of course, and I should have made clear that by "concurrent" I meant "concurrent but separate connections": so that if you and I each have a connection and each execute the SP in question, there won't be any collisions. Thanks for the correction.

    Roland Alexander 
    The Monday Morning DBA 
    There are two means of refuge from the miseries of life: music and cats. ~ Albert Schweitzer

  • Thanks for the great input. It was helpful.

    Cliff

  • Sean Lange (1/11/2013)


    You do not have as much control over the datatypes when using a select into either, it will use the datatype from the source. That is not a bad thing but something to understand about how it works.

    We can manipulate this behavior though. Say I have a column that is a NVARCHAR(128) and I want it to be a NVARCHAR(500) in my temp table if I cast that column in my SELECT-column-list SQL Server will honor the casted datatype when the temp table is created, e.g.

    SELECT object_id,

    CAST(name AS NVARCHAR(500)) AS name

    INTO #TempTable

    FROM sys.tables;

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • i think "insert into select " provide better readabilty when we use this approach in "long time used queries/stored proc". we can easily see what are the columns and what kind of data we are going to store ? .Easy to troubleshoot.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Bhuvnesh (1/14/2013)


    i think "insert into select " provide better readabilty when we use this approach in "long time used queries/stored proc". we can easily see what are the columns and what kind of data we are going to store ? .Easy to troubleshoot.

    I agree with this. Explicitly creating the table and doing an "INSERT INTO SELECT [...]" is self-documenting and when you have to go back and look at the code a year later you will know the data structure of the temp table without having to do a lot of digging into the procedure. Creating the table first also enforces a bit of discipline by forcing the developer to think about what data is going to be handled by a procedure (as opposed to the dynamic SELECT INTO approach).

    Also, as someone mentioned above, it's important to have a primary key even on a temp table and in my experience I've found that inexperienced developers tend to use SELECT INTO because it's "easier" and wind up creating a heap because they just don't know any better. For my shop I have a standard to use for temp tables and on my own workstation I have a hot key for the RedGate snippet manager that inserts a template on demand:

    IF OBJECT_ID('tempdb..#TempTable') IS NOT NULL

    DROP TABLE #TempTable

    CREATE TABLE #TempTable (

    [ID] INT IDENTITY(1,1) NOT NULL,

    [Col1] INT NULL,

    [Col2] NVARCHAR(50) NULL,

    PRIMARY KEY (ID))

    INSERT INTO #TempTable

    SELECT

    Col1

    ,Col2

    FROM

    dbo.TableName

    WHERE

    1 = 1

    Another issue to consider is recompilation. Mixing DDL and DML within a procedure can cause recompilation. In the code snippet above the DDL "CREATE TABLE" comes prior to the DML "INSERT INTO". But using "SELECT INTO" in the wrong order can cause recompilation every time a procedure is run. The procedure should be tested to make sure recompilation is not occurring. (See Optimizing SQL Server Stored Procedures to Avoid Recompiles[/url] and Troubleshooting stored procedure recompilation.)

    A final note to consider is the creation of indexes on a temp table. As you can see in the snippet example above, SQL will create it's own uniquely-named PK when the table is created. However, if a primary key or index is added after the temp table is created there is no guarantee that the name of the key or index will be unique. This can cause an error if the procedure is creating a temp table dynamically for one user and then tries to create the "same" temp table for a different instance of the procedure because the names of the keys or indexes may already exist.

    When explicitly creating a temp table with the Primary Key attribute the PK will be named something like this:

    [PK__#TempTable__3214EC27582F7143]

    Every time the procedure is called, it will create a unique temp table object and PK. If a temp table is created like below a unique PK will also be created:

    SELECT ProductID, CategoryID

    INTO #TempTable

    FROM dbo.Products

    ALTER TABLE dbo.Products ADD PRIMARY KEY(ProductID)

    Though in my experience it's rarely needed, if for some reason additional keys are deemed necessary for a temp table care must be taken to make sure the keys have unique names. For example:

    DECLARE

    @strCreateIndex NVARCHAR(1000)

    ,@IndexUniqueID NVARCHAR(50)

    SET @IndexUniqueID = N'IX__#TempTable__'+REPLACE(CAST(NEWID() AS NVARCHAR(50)),'-','')

    SET @strCreateIndex =

    N'CREATE NONCLUSTERED INDEX [' + @IndexUniqueID + '] ON [#TempTable] ([ProductID])'

    EXEC sp_executesql @strCreateIndex

  • Steven Willis (1/14/2013)


    ...if for some reason additional keys are deemed necessary for a temp table care must be taken to make sure the keys have unique names. For example:

    DECLARE

    @strCreateIndex NVARCHAR(1000)

    ,@IndexUniqueID NVARCHAR(50)

    SET @IndexUniqueID = N'IX__#TempTable__'+REPLACE(CAST(NEWID() AS NVARCHAR(50)),'-','')

    SET @strCreateIndex =

    N'CREATE NONCLUSTERED INDEX [' + @IndexUniqueID + '] ON [#TempTable] ([ProductID])'

    EXEC sp_executesql @strCreateIndex

    Primary Keys must be unique across an entire database but index names only need to be unique within the scope of a given table.

    CREATE TABLE #a (id INT)

    CREATE INDEX ix1 ON #a(id)

    CREATE TABLE #b (id INT)

    CREATE INDEX ix1 ON #b(id)

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Thanks Steven. One motivation was to see about performance difference and common steps to use temp table. The other motivation was to work towards a coding standard like you mentioned.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply