TRUNCATE and INSERT INTO slower than TRUNCATE GO INSERT INTO

  • Hi guys,

    I wonder why executing a TRUNCATE and an INSERT INTO command together is much slower than executing these commands separately?

    Conditions:
    recovery model: simple
    table1 & table2 have the same structure & clustered index.
    row count: 10,000,000
    Data space: 1.8 GB

    Using GO results in about 50% less execution time. On my system: 34 seconds with GO versus 65 seconds without.

    TRUNCATE TABLE table1

    --GO  -- without the GO, the script is slower

    INSERT INTO table1  WITH (TABLOCK)
       SELECT  *   FROM   table2
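
    (Timings are the total elapsed time reported by SSMS.  If anyone wants to reproduce the per-statement numbers as well, something like this works; a minimal sketch, same tables as above.)

    SET STATISTICS TIME ON;   -- session setting, so it survives the GO

    TRUNCATE TABLE table1
    --GO  -- run once with and once without the batch separator

    INSERT INTO table1 WITH (TABLOCK)
       SELECT * FROM table2

    SET STATISTICS TIME OFF;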

    Why is this slower?
    Is it possible to get the same performance in a stored procedure?

    Thanks for your help.

  • Not sure, but my guess? The GO sends the batch, so SQL does the truncate, which defers some of the deallocations, then gets the insert as a new batch and processes it right away. Without the GO, I suspect SQL tries to deallocate all the pages before it starts the insert. Just a guess.

  • mrpellepelle - Thursday, October 12, 2017 8:16 AM

    Interesting. If you compare total elapsed time (i.e., include the time taken for the INSERT to complete), is the GO method still faster?

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • There should be no difference. Is that the whole query being executed? If not, post ALL the code!
    😎

  • How consistently are you able to generate those results?

  • Hi,
    I did some more testing and checked the data file I/O behaviour:
    Without GO: the writes go to the LDF, with a burst of writes to the MDF at the end.
    With GO: the writes go to the MDF, with almost no LDF writes.
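
    If anyone wants to reproduce this, a before/after snapshot of sys.dm_io_virtual_file_stats should show the same per-file write pattern (a sketch; run it before and after each test and diff the counters, file names will differ per system):

    SELECT  df.name,
            df.type_desc,                    -- ROWS = MDF/NDF, LOG = LDF
            vfs.num_of_writes,
            vfs.num_of_bytes_written
    FROM    sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
    JOIN    sys.database_files AS df
            ON df.file_id = vfs.file_id;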

    @steve-2 Jones
    Does this support your  assumption?

    @ZZartin
    I tested this behaviour about 20 times on 2 systems with the same hardware/software. I always come up with a time difference of about 50%.

    @Eirikur Eiriksson
    It is all the code. :unsure:

    @Phil Parkin
    I am not sure I get your point. I executed all the code, with and without the GO command, and checked the total execution time in SSMS.

  • mrpellepelle - Friday, October 13, 2017 1:46 AM

    Haha, now that I reread it, me neither. And I hadn't even been drinking. I think I had some notion about something executing asynchronously, but it doesn't stand up to close examination 🙂


  • Not sure. I think so, but I'm not certain. Let me ask someone else who might understand this better.

  • Just curious whether there's any difference in the execution plans generated with versus without the GO.

    Steve (aka sgmunson) 🙂 🙂 🙂
    Rent Servers for Income (picks and shovels strategy)

  • I sent a note off to Paul Randal. He says there's no difference, and there shouldn't be much of one at all.

    You've got something else going on: log growths, checkpoints, something like that. Is this db in simple mode?
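
    A quick sanity check between runs would be something along these lines (a sketch; sys.dm_db_log_space_usage needs 2012+, and DBCC SQLPERF(LOGSPACE) gives the same numbers on older builds):

    -- how big/full is the log?
    SELECT  total_log_size_in_bytes / 1048576.0 AS log_size_mb,
            used_log_space_in_percent
    FROM    sys.dm_db_log_space_usage;

    -- recovery model and anything holding up log reuse
    SELECT  name, recovery_model_desc, log_reuse_wait_desc
    FROM    sys.databases
    WHERE   name = DB_NAME();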

  • mrpellepelle - Thursday, October 12, 2017 8:16 AM

    Possibly flying in the face of what the experts and a rather serious white paper on the subject claim...

    In the experiments I've been doing with the idea of achieving Minimally Logged INSERTs into an empty table, this seems to have to do with an undocumented "problem" that I've found with Minimal Logging, and it's easily fixed.  Those experiments looked at whether or not Minimal Logging would actually occur when a variable was in the code and the value of the variable wasn't known until run time.  I also did experiments on the documented requirement that the INSERTs occur in the same order as the target Clustered Index.  I don't believe the ORDER BY problem comes into play here, but I do believe you're having a similar problem with TRUNCATE as I was having with run-time-populated variables: the optimizer doesn't necessarily know that the table is being truncated and so assumes it's not.  As with the run-time-populated variables problem, I believe the fix for the "TRUNCATE in the same batch" problem may be to add OPTION (RECOMPILE) to the INSERT code.

    And, no... I've not yet done a deep dive on the EXECUTION PLANs in any case.  I only did repeatable tests to demonstrate the workaround of using OPTION (RECOMPILE) in association with the run-time variable problem.

    I'll also state that since it's a requirement for "Minimally Logged Inserts to a Clustered Index" that the input must be in the same order as the target Clustered Index, you shouldn't take a chance on whether or not you need to include an ORDER BY.  I've taken to always including it because, I've found, if it's needed, it will be used.  If it's not, it won't.  Nothing like a usually free guarantee.
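
    Against the code you posted, what I'm suggesting looks something like the following.  It's only a sketch; the ORDER BY column is a placeholder for whatever your Clustered Index is actually keyed on, which I don't know yet.

    TRUNCATE TABLE table1;

    INSERT INTO table1 WITH (TABLOCK)
    SELECT *
      FROM table2
     ORDER BY <clustered index key column(s)>  -- placeholder: must match the order of the target Clustered Index
    OPTION (RECOMPILE);                        -- per the theory above, lets the optimizer "see" the truncated table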

    If you have the time, would you post the DDL, constraints, defaults, and indexes for your table?  I'd like to setup a test and the closer I can get to emulating your 10 million row table, the more definitive the test results will be.  Thanks.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • When I add OPTION (RECOMPILE) to the INSERT statement, there is no time difference anymore.
    GREAT, but why?

    @jeff Moden
    Anonymized, compressed DDL:

    CREATE TABLE [dbo].[contract](
        [contract_id] [INT] NOT NULL,
        id1  [INT] NULL, id2  [INT] NULL, id3  [INT] NULL, id4  [INT] NULL,
        id5  [INT] NULL, id6  [INT] NULL, id7  [INT] NULL, id8  [INT] NULL,
        id9  [INT] NULL, id10 [SMALLINT] NULL, Id11 [INT] NULL, id12 [INT] NULL,
        id13 [INT] NULL, id14 [INT] NULL, id15 [INT] NULL, id16 [INT] NULL,
        id17 [INT] NULL, id18 [INT] NULL, id19 [INT] NULL, id20 [INT] NULL,
        id21 [INT] NULL,
        id22 [TINYINT] NOT NULL, id23 [TINYINT] NOT NULL, id24 [TINYINT] NOT NULL,
        id25 [TINYINT] NOT NULL, id26 [TINYINT] NOT NULL, id27 [TINYINT] NOT NULL,
        id28 [TINYINT] NOT NULL, id29 [TINYINT] NOT NULL, id30 [TINYINT] NOT NULL,
        id31 [TINYINT] NOT NULL,
        id32 [INT] NOT NULL, id33 [INT] NULL,
        CONSTRAINT [PK_contract] PRIMARY KEY CLUSTERED
        (
            [contract_id] ASC
        ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    GO
    @sgmunson The execution plan is the same: Clustered Index Scan -> Clustered Index Insert.
    @steve-2 Jones As mentioned in my first post, the recovery model is simple.

  • mrpellepelle - Monday, October 16, 2017 3:22 AM

    I already told you when I suggested it. 😉  It's an undocumented, sometimes-necessary trick to get the optimizer to do Minimal Logging in the presence of a Clustered Index.
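
    If you want to see it for yourself, compare how much log each variant generates.  Something along these lines works as a test harness (just a sketch; I've not run it against your DDL, and the explicit transaction is only there so the log counters can be read before the rollback):

    BEGIN TRANSACTION;

        TRUNCATE TABLE dbo.contract;

        INSERT INTO dbo.contract WITH (TABLOCK)
        SELECT *
          FROM table2                   -- your source table from the original example
         ORDER BY contract_id
        OPTION (RECOMPILE);             -- run once with and once without this line

        -- log generated by this transaction so far; a fully logged 10M row INSERT
        -- should dwarf a minimally logged one
        SELECT  dt.database_transaction_log_record_count,
                dt.database_transaction_log_bytes_used
        FROM    sys.dm_tran_database_transactions AS dt
        JOIN    sys.dm_tran_current_transaction   AS ct
                ON ct.transaction_id = dt.transaction_id
        WHERE   dt.database_id = DB_ID();

    ROLLBACK;                           -- keeps the test repeatable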

    --Jeff Moden



  • Hi guys,

    Are there any other solutions or explanations for this behaviour?
    I do not feel comfortable with adding OPTION (RECOMPILE) to every INSERT we use to load our data warehouse environment.
    I also tried trace flag 610, but it did not improve the timings.

  • mrpellepelle - Wednesday, October 18, 2017 3:13 AM

    If it were for a GUI-related proc that hit the database tens of thousands of times per hour, I could absolutely understand your angst about using OPTION (RECOMPILE).  Of course, you wouldn't be looking for "Minimal Logging" then.  For loading a DW in batch mode, the recompiles will be trivial compared to the time and resources they save, and they're necessary, in this case, to trick the optimizer into realizing that it actually can do the "Minimal Logging".

    Unless someone can come up with a non-code-based trick to allow the optimizer to make the correct decision at run time, you're pretty much stuck with either slow, resource-intensive code or the option to recompile.  I recommend that you get comfortable with the latter. 😉
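
    And, to your original question about a stored procedure: OPTION (RECOMPILE) is a statement-level hint, so it works the same way inside a proc.  Roughly (a sketch only; the proc and source-table names below are placeholders, and the ORDER BY uses the PK_contract key from the DDL you posted):

    CREATE PROCEDURE dbo.Reload_Contract      -- placeholder name
    AS
    BEGIN
        SET NOCOUNT ON;

        TRUNCATE TABLE dbo.contract;

        INSERT INTO dbo.contract WITH (TABLOCK)
        SELECT *
          FROM dbo.contract_source            -- placeholder for "table2" in the original example
         ORDER BY contract_id                 -- same order as the target Clustered Index
        OPTION (RECOMPILE);                   -- needed for the same reason as in the ad hoc batch
    END;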

    --Jeff Moden


