November 17, 2009 at 2:07 pm
We write a lot of stored procedures for reporting purposes and in those stored procedures we create a lot of temp tables. Our DBA insists that we create the temp table first and then do an insert into that temp table instead of doing a select into temp. I Google'd this online and a lot of documentation on this talks about SQL Server 6.5 and the issues that it had with Select into (we are running SQL Server 2005). Does it still create a lock on tempdb? Is there really any performance gains to had from doing it in this manner?
November 17, 2009 at 2:14 pm
It hasn't been a problem in over a decade, but that doesn't stop lots of DBAs from thinking it is one.
I've done a lot of load and speed testing with both methods, and Select Into is generally faster and creates less contention than Create Insert Select. Not always faster, but more often than not.
The only place I've see it be slower is when you're dealing with very large amounts of undefined data. For example, Select Into from an OpenRowset on a text file or spreadsheet. With that, SQL Server has to scan the input to decide what data type to make a column. Usually, as per data on IMEX and the related registry entries, that's fast. But, if you have had to modify the registry to allow for full column scanning (mixed data types in Excel columns is the usual reason for that), then Select Into can become very slow compared to Create Insert Select.
If you aren't dealing with that particular situation, then Select Into is almost certainly going to be faster and more efficient. It also has advantages with regards to avoiding column truncation and a few things like that.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 17, 2009 at 3:22 pm
I think GSquared and I are having philosophical differences today. I generally don't do SELECT INTO because at design time I want to know exactly what the structure and all datatypes will be. As far as performance I really think in most cases we are talking fractions of fractions of seconds. But like many things, style may vary and you can test the variations to see what you like. Me, I like the Create/Insert method.
CEWII
November 17, 2009 at 3:38 pm
I've always forced our guys to define all temp tables at the beginning of the stored procedure. Interleaving DDL with DML causes the SPs to recompile. I was under the assumption that using SELECT...INTO...FROM also caused SP recompilations.
Is this still true in SQL Server 2005? If so, it would be another arguement for defining your temp tables with the CREATE TABLE construct.
November 18, 2009 at 7:19 am
John Rowan (11/17/2009)
I've always forced our guys to define all temp tables at the beginning of the stored procedure. Interleaving DDL with DML causes the SPs to recompile. I was under the assumption that using SELECT...INTO...FROM also caused SP recompilations.Is this still true in SQL Server 2005? If so, it would be another arguement for defining your temp tables with the CREATE TABLE construct.
In SQL 2000, this was partially true. I say partially because if the data in the temp table changes significantly, it would have to recompile again. Having all the DDL at the begining of the proc only avoids recompiles if the stats in the temp tables don't change enough during runtime to require building new execution plans for later steps in the proc. Since a heavy insert or any update on an indexed column can cause stat changes, it's pretty hard to avoid this issue, regardless of sequence of DDL/DML. The main way to avoid this issue is to use table variables, since they don't involve recompiles on DDL nor do they keep stats.
In SQL 2005, it can cause statement recompilation, but doesn't generally cause whole-proc recompilation. SQL 2005 will also hold onto the definition and first page of data for later runs of the proc if it calculates that this will save recompiles and execution time.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 18, 2009 at 7:27 am
Elliott W (11/17/2009)
I think GSquared and I are having philosophical differences today. I generally don't do SELECT INTO because at design time I want to know exactly what the structure and all datatypes will be. As far as performance I really think in most cases we are talking fractions of fractions of seconds. But like many things, style may vary and you can test the variations to see what you like. Me, I like the Create/Insert method.CEWII
Correct. The speed difference only matters if your database will have to scale to heavy use.
But you do mention one thing that I consider a drawback to Create Insert as opposed to Select Into: You want to know the exact structure and datatype at design time, whereas I see it as you must know the exact structure and data type for all future runtimes.
I recently had a problem come up because a varchar column in a table had to be expanded by 5 characters to accommodate a new data case, and a proc with a temp table in it didn't have it's column also expanded by 5 characters. Developer overlooked the issue. Unit testing wasn't adequate, and the table change was taken live. Simple human error.
Select Into avoids that issue. You can, as needed, change a base table, without rippling the change through every proc that uses a temp table on it.
Of course, this isn't a big deal. It's a minor difference. There are advantages and drawbacks either way. I see the advantages for Select Into as outweighing the drawbacks, but that certainly doesn't mean I'm right and you're wrong, or vice versa, it just means we weigh these things differently.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 18, 2009 at 10:02 am
GSquared (11/18/2009)
Of course, this isn't a big deal. It's a minor difference. There are advantages and drawbacks either way. I see the advantages for Select Into as outweighing the drawbacks, but that certainly doesn't mean I'm right and you're wrong, or vice versa, it just means we weigh these things differently.
And that is ok. I have always said there is more than one way to solve most problems, and often more than one good way. This is more style and outlook than hard performance and best practive. And again, that is ok..
CEWII
November 18, 2009 at 12:08 pm
If the table you are selecting into will later have a clustered index created on it, it is usually faster to create the table first and insert the data in order by the clustered index than to load the whole table and then reload it by creating a clustered index.
November 18, 2009 at 12:47 pm
Michael Valentine Jones (11/18/2009)
If the table you are selecting into will later have a clustered index created on it, it is usually faster to create the table first and insert the data in order by the clustered index than to load the whole table and then reload it by creating a clustered index.
Yep. That can make a difference too. Matters more on tables that are large enough to overflow onto disk as opposed to staying in RAM, but definitely something to consider.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 18, 2009 at 12:49 pm
Elliott W (11/18/2009)
GSquared (11/18/2009)
Of course, this isn't a big deal. It's a minor difference. There are advantages and drawbacks either way. I see the advantages for Select Into as outweighing the drawbacks, but that certainly doesn't mean I'm right and you're wrong, or vice versa, it just means we weigh these things differently.And that is ok. I have always said there is more than one way to solve most problems, and often more than one good way. This is more style and outlook than hard performance and best practive. And again, that is ok..
CEWII
One main advantage to create first insert second is that you can document that much more easily and clearly.
Another is that you can add constraints before you insert.
It definitely goes both ways.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 19, 2009 at 2:33 am
Just a couple of other points that spring to mind:
SELECT INTO can be minimally logged if the recovery model is BULK_LOGGED or SIMPLE.
SELECT INTO is limited to creating a table on the default file group.
ISNULL() together with an explicit CAST or CONVERT can help document and enforce column types when creating a table with SELECT INTO:
SELECT
c1 = CONVERT(TINYINT, 1),
c2 = CONVERT(VARCHAR(10), 'Fish'),
c3 = CONVERT(NVARCHAR(5), N'Chips'),
c4 = CONVERT(SMALLDATETIME, CURRENT_TIMESTAMP),
c5 = CONVERT(MONEY, $5.99),
c6 = CONVERT(UNIQUEIDENTIFIER, NEWID())
INTO
#Disposable;
GO
-- All columns are NULLable
EXECUTE tempdb..sp_columns @table_name = '#Disposable';
GO
DROP TABLE #Disposable;
GO
SELECT
c1 = ISNULL(CONVERT(TINYINT, 1), 0),
c2 = ISNULL(CONVERT(VARCHAR(10), 'Fish'), ''),
c3 = ISNULL(CONVERT(NVARCHAR(5), N'Chips'), N''),
c4 = ISNULL(CONVERT(SMALLDATETIME, CURRENT_TIMESTAMP), {d '2000-01-01'}),
c5 = ISNULL(CONVERT(MONEY, $5.99), $0.00),
c6 = ISNULL(CONVERT(UNIQUEIDENTIFIER, NEWID()), {guid'00000000-0000-0000-0000-000000000000'})
INTO
#Disposable;
GO
-- All columns are NOT NULLable
EXECUTE tempdb..sp_columns @table_name = '#Disposable';
GO
DROP TABLE #Disposable;
Viewing 11 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy