SELECT * INTO vs INSERT INTO Performance

  • Myself and my colleagues have noticed that performance can be significantly worse when running

    SELECT * INTO tempTable FROM sourceTable

    rather than running

    SELECT TOP 1 * INTO tempTable FROM sourceTable INSERT INTO tempTable SELECT * FROM sourceTable

    Can anyone explain why this might be, or provide links?

    Thanks!

  • It depends on what you're selecting from, so far as I know.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • JohnnyDBA (10/27/2010)


    Myself and my colleagues have noticed that performance can be significantly worse when running

    SELECT * INTO tempTable FROM sourceTable

    rather than running

    SELECT TOP 1 * INTO tempTable FROM sourceTable INSERT INTO tempTable SELECT * FROM sourceTable

    Can anyone explain why this might be, or provide links?

    Thanks!

    Is it really a SELECT * statement or are the columns defined?

    http://blog.sqlauthority.com/2010/03/03/sql-server-performance-comparison-insert-top-n-into-table-using-top-with-insert/

  • It's really a SELECT * statement.

    After further research we're seeing that the performance issues can occur using either a SELECT * INTO or an INSERT INTO depending on the situation but we do not know what is triggering the situation yet.

    GSquared, can you elaborate on the "it depends on the columns" statement?

    We're going to be researching this but if anyone has more information that would really help because it's causing certain queries that should take 1-3 seconds to take up to 2 minutes on relatively small tables.

  • Actually, what I wrote is it depends on what you're selecting from.

    In order to do a Select Into, SQL has to determine data types, and potentially data sizes, from the source data. So, if those are other SQL tables, it can use its own metadata to generate the right columns very easily. On the other hand, if it's a text or Excel file, it has to read a few rows, parse out their type and size, and then attempt an insert. That adds a fair amount of overhead to the process, which can be avoided by explicitly defining the target table instead of having SQL Server figure it out during runtime.

    On the other hand, that's a moot point if both are intermittently slow. That means you're either running into issues with the amount of data being variable, or with something like tempdb having to grow (possibly multiple times) in order to process the data.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • It's not intermittent meaning that sometimes we run the SELECT * INTO and it's slow and other times we run the same SELECT * INTO it's fast...

    It's more like some tables we are running this process on are slow when using SELECT * INTO and some tables we run this process on are slow when using an INSERT INTO.

    We haven't found the link indicating why it would be slow using one vs the other yet.

  • Are the queries different?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I think "sourceTable" might be locked by a transaction at the time you are running the statements...try using the "(nolock)" hint. I'm curious to see if it's the point.

    Best regards.

    Guilherme.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply