Insert Faster Way

  • Guys,
    I need to make this INSERT run faster.

    Query :
    set identity_insert [Test].dbo.JOB ON
    INSERT INTO [Test].[dbo].JOB (
       [JOB_ID]
      ,[JOB_DEFINITION_ID]
      ,[PROCESS_ID]
      ,[PARENT_JOB_ID]
      ,[PREVIOUS_BATCH_ID]
      ,[FROM_ID]
      ,[FROM_SOURCE_ID]
      ,[FROM_SESSION_ID]
      ,[TO_ID]
      ,[TO_SOURCE_ID]
      ,[TO_SESSION_ID]
      ,[JOB_STATUS]
      ,[JOB_INFO]
      ,[LOCKED_BY]
      ,[UDATE]
      ,[CDATE]
    )
    SELECT
     [JOB_ID]
      ,[JOB_DEFINITION_ID]
      ,[PROCESS_ID]
      ,[PARENT_JOB_ID]
      ,[PREVIOUS_BATCH_ID]
      ,[FROM_ID]
      ,[FROM_SOURCE_ID]
      ,[FROM_SESSION_ID]
      ,[TO_ID]
      ,[TO_SOURCE_ID]
      ,[TO_SESSION_ID]
      ,[JOB_STATUS]
      ,[JOB_INFO]
      ,[LOCKED_BY]
      ,[UDATE]
      ,[CDATE]
     FROM [CR_Test].DBO.JOB with(nolock)
    set identity_insert [Test].dbo.JOB OFF

    Source table has 11,724,192 rows.

    Any suggestions? thx

  • What if you insert the data in batches? That would make the transactions smaller, and maybe faster.
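    Something like this WHILE-loop sketch, for example. This is only illustrative: it assumes JOB_ID is a reasonably dense integer key, and the @BatchSize value is just a starting point to tune.

    ```sql
    SET IDENTITY_INSERT [Test].dbo.JOB ON;

    DECLARE @BatchSize BIGINT = 1000000,
            @FromId    BIGINT,
            @MaxId     BIGINT;

    SELECT @FromId = MIN(JOB_ID), @MaxId = MAX(JOB_ID)
      FROM [CR_Test].dbo.JOB;

    WHILE @FromId <= @MaxId
    BEGIN
        INSERT INTO [Test].dbo.JOB
               (JOB_ID, JOB_DEFINITION_ID, PROCESS_ID, PARENT_JOB_ID, PREVIOUS_BATCH_ID
               ,FROM_ID, FROM_SOURCE_ID, FROM_SESSION_ID, TO_ID, TO_SOURCE_ID
               ,TO_SESSION_ID, JOB_STATUS, JOB_INFO, LOCKED_BY, UDATE, CDATE)
        SELECT  JOB_ID, JOB_DEFINITION_ID, PROCESS_ID, PARENT_JOB_ID, PREVIOUS_BATCH_ID
               ,FROM_ID, FROM_SOURCE_ID, FROM_SESSION_ID, TO_ID, TO_SOURCE_ID
               ,TO_SESSION_ID, JOB_STATUS, JOB_INFO, LOCKED_BY, UDATE, CDATE
          FROM [CR_Test].dbo.JOB
         WHERE JOB_ID >= @FromId
           AND JOB_ID <  @FromId + @BatchSize;

        SET @FromId += @BatchSize;  -- move to the next key range

        CHECKPOINT; -- in the SIMPLE recovery model, lets log space from the batch be reused
    END;

    SET IDENTITY_INSERT [Test].dbo.JOB OFF;
    ```

    Each transaction then covers at most @BatchSize rows instead of all 11.7 million.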

  • You mean something like using a WHILE loop?

  • RGP - Thursday, January 3, 2019 1:01 PM


    What indexes does this table have and are there any FKs?  Also, any chance of you temporarily slipping the database into the BULK-LOGGED Recovery Model?

    Also, not many people know it, but setting IDENTITY_INSERT ON frequently results in a full sort of all the data (SORT_IN_TEMPDB) behind the scenes.  I did find a link where a fellow demonstrates a way around that using SWITCH, but I've not yet tried it.
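    For reference, both of those questions can be answered from the standard catalog views; a generic sketch (the table name is just the one from this thread):

    ```sql
    -- What indexes does the table have?
    SELECT i.name, i.type_desc, i.is_unique
      FROM sys.indexes i
     WHERE i.object_id = OBJECT_ID(N'dbo.JOB');

    -- Any foreign keys pointing at it, or from it?
    SELECT fk.name
      FROM sys.foreign_keys fk
     WHERE fk.parent_object_id     = OBJECT_ID(N'dbo.JOB')
        OR fk.referenced_object_id = OBJECT_ID(N'dbo.JOB');
    ```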

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Thursday, January 3, 2019 6:24 PM


    Thanks for the suggestions.
    The target table has one clustered Index.

    Please see below.

    I will try setting the target DB to the BULK LOGGED recovery model and see how it works.

    /****** Object: Index [ClusteredIndex-20190103-142533]  Script Date: 2019-01-04 1:13:55 PM ******/
    CREATE CLUSTERED INDEX [ClusteredIndex-20190103-142533] ON [dbo].[JOB]
    (
        [JOB_ID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 50)
    GO

  • RGP - Friday, January 4, 2019 10:16 AM


    Careful... don't jump the gun.  Just setting the database to Bulk Logged isn't going to help.  It's just a prerequisite for minimal logging.

    Here are more requirements to achieve minimal logging with a clustered index in place using INSERT/SELECT.
    1.  The object of the INSERT must be locked with a WITH(TABLOCK).
    2.  The INSERT must be done in the same order as the Clustered Index.  I recommend you do an ORDER BY; SQL Server will skip the sort if it's not needed.
    3.  It's undocumented but, through some really bad experiences, I've learned the hard way that if you have ANY variables in the INSERT/SELECT, you MUST use OPTION(RECOMPILE) or things might not be minimally logged.  Since the time of the recompile is trivial compared to the actual INSERT, I ALWAYS include OPTION(RECOMPILE) in the INSERT/SELECT.

    Further, and like I said, the SET IDENTITY_INSERT ON thing has an undocumented problem that may queer the whole deal by causing a totally unnecessary sort in TempDB.  I've not had the time to find the link I was talking about that suggested a SWITCH could be used to overcome that, but I'll find it tonight.

    Last but not least, what the heck is up with the 50% FILL FACTOR?  😉  It won't help a thing for the sequential inserts you're doing.  Are you updating the data in the table after the load?  If so, post the CREATE TABLE statement for the table and any defaults/constraints, and provide a list of any of the variable-width columns that are suffering an update.  Without knowing what's going on in the updates, the 50% FILL FACTOR may be nothing more than a complete waste of memory and disk space.

    --Jeff Moden



  • Jeff Moden - Friday, January 4, 2019 10:56 AM


    Thank you again for so many tips. There are no updates happening on the target table after the inserts. I would rather remove the index (let me know your thoughts?).

  • "It Depends".  What is the purpose of copying the data to the target table and what will the target table be used for?

    Also, does the target table start out with no rows or does it have rows already in it?  Apologies for not asking that before.

    --Jeff Moden



  • Thanks Jeff, no issues at all, I'm so glad that you are so responsive to posts... the target table will be used just for archival.

    This target table will be empty... we will probably do this process quite often, i.e. as and when needed... so we are likely to use the target table for read-only purposes later on. Thanks!!

  • RGP - Friday, January 4, 2019 7:21 PM


    Perfect.  I'm working on the demo now.  And a Clustered Index is always worth having on something like this, even if it's only being used as a "uniquifier" column.

    --Jeff Moden



  • Ok... here we go...

    First, I wanted to make sure that I had the same volume of data you have.  I'd forgotten to ask you to post the DDL for your table, so I swagged it.  The following code creates a test table similar to yours and then populates it with 11,724,192 rows, like your table has.  Not to worry... the first lesson here is that "Minimal Logging" also helps you build test data quickly on a table with a Clustered Index (a minute and 33 seconds on my laptop), and it does so without blowing out either your log file when you populate the table or your MDF file because of the Clustered Index build.  Comments are in the code and you should read them.

    --=====================================================================================================================
    --      Simulate RGP's table and populate it.  This is NOT a part of the solution. We're just building test data here.
    --      If you're already in the Bulk-Logged Recovery Model, this will all be Minimally Logged and takes only 00:01:33
    --      (hh:mi:ss) on my laptop.  If not, you might want to get there now because this can take substantially longer
    --      in the FULL Recovery Model.
    --=====================================================================================================================
    --  DROP TABLE dbo.JBMTestSOURCE
     CREATE TABLE dbo.JBMTestSOURCE
            (
             JOB_ID             INT IDENTITY(1,1)
            ,JOB_DEFINITION_ID  INT
            ,PROCESS_ID         INT
            ,PARENT_JOB_ID      INT
            ,PREVIOUS_BATCH_ID  INT
            ,FROM_ID            INT
            ,FROM_SOURCE_ID     INT
            ,FROM_SESSION_ID    INT
            ,TO_ID              INT
            ,TO_SOURCE_ID       INT
            ,TO_SESSION_ID      INT
            ,JOB_STATUS         CHAR(1)
            ,JOB_INFO           VARCHAR(50)
            ,LOCKED_BY          VARCHAR(50)
            ,UDATE              DATETIME
            ,CDATE              DATETIME
            CONSTRAINT PK_JBMTestSOURCE PRIMARY KEY CLUSTERED (JOB_ID ASC)
            )
    ;
    --=====  Create a substantial test table with the following columns and values.
     INSERT INTO dbo.JBMTestSOURCE WITH (TABLOCK) --Required for "Minimal Logging"
     SELECT  TOP (11724192) --The number of rows that RGP said were in the table
        --   JOB_ID             = IDENTITY(INT,1,1)
             JOB_DEFINITION_ID  = ABS(CHECKSUM(NEWID())%500000)+1 --Just random integers from 1 to 500,000
            ,PROCESS_ID         = ABS(CHECKSUM(NEWID())%500000)+1
            ,PARENT_JOB_ID      = ABS(CHECKSUM(NEWID())%500000)+1
            ,PREVIOUS_BATCH_ID  = ABS(CHECKSUM(NEWID())%500000)+1
            ,FROM_ID            = ABS(CHECKSUM(NEWID())%500000)+1
            ,FROM_SOURCE_ID     = ABS(CHECKSUM(NEWID())%500000)+1
            ,FROM_SESSION_ID    = ABS(CHECKSUM(NEWID())%500000)+1
            ,TO_ID              = ABS(CHECKSUM(NEWID())%500000)+1
            ,TO_SOURCE_ID       = ABS(CHECKSUM(NEWID())%500000)+1
            ,TO_SESSION_ID      = ABS(CHECKSUM(NEWID())%500000)+1
            ,JOB_STATUS         = CHAR(ABS(CHECKSUM(NEWID())%5)+65) --Random letters "A" thru 'E'
            ,JOB_INFO           = NEWID() --Doesn't matter... just something for filler
            ,LOCKED_BY          = NEWID() --Doesn't matter... just something for filler
            ,UDATE              = RAND(CHECKSUM(NEWID()))*DATEDIFF(dd,'2010','2020')+CAST('2010' AS DATETIME)
            ,CDATE              = RAND(CHECKSUM(NEWID()))*DATEDIFF(dd,'2010','2020')+CAST('2010' AS DATETIME)
       FROM      sys.all_columns t1
      CROSS JOIN sys.all_columns t2
     OPTION (RECOMPILE)  --Helps guarantee "Minimal Logging"
    ;
    --===== Let's see the first 100 rows so we know it worked.
     SELECT TOP 100 *
       FROM dbo.JBMTestSOURCE
    ;

    Up next, I needed a target table similar to the source table.  I say "similar" because it's not identical to the source table, and I strongly recommend you change your target table.  Since you said it's an Archive table and that it will never be updated, there's absolutely no sense in having an IDENTITY column.  What you're probably not aware of is that SET IDENTITY_INSERT ON currently causes a full sort of all the rows and all the data in TempDB.  That takes a shedload of time and resources, and it can cause a pot wad of contention in TempDB.

    Again, I didn't have the DDL for your table, so I swagged it like I did the source table.  Your DDL may be different, but make sure you don't use the IDENTITY property, and make sure the column is NOT NULL so that we can still put a Clustered PK on it.  I've also included a link in the code that will lead you to the fault with SET IDENTITY_INSERT ON; it was reported to MS almost half a decade ago, they haven't fixed it yet, and it doesn't look like they will anytime soon because it's still listed as "Unplanned".

    --=====================================================================================================================
    --      Create the target table (you'll need to rename it).  Since this is an ARCHIVE table that will never be updated,
    --      there's no need to preserve the IDENTITY property in the table.  It just slows things down and causes an
    --      unnecessary sort in TempDB when you turn SET IDENTITY INSERT ON.  See the following MS fault feedback item
    --      on that little undocumented pain in the socks.
    -- https://feedback.azure.com/forums/908035-sql-server/suggestions/32895409-bulk-insert-with-identity-column-creates-query-pl
    --=====================================================================================================================
    --  DROP TABLE dbo.JBMTestTARGET
     CREATE TABLE dbo.JBMTestTARGET
            (
             JOB_ID             INT NOT NULL --The table is identical except we don't need an IDENTITY column on this.
            ,JOB_DEFINITION_ID  INT
            ,PROCESS_ID         INT
            ,PARENT_JOB_ID      INT
            ,PREVIOUS_BATCH_ID  INT
            ,FROM_ID            INT
            ,FROM_SOURCE_ID     INT
            ,FROM_SESSION_ID    INT
            ,TO_ID              INT
            ,TO_SOURCE_ID       INT
            ,TO_SESSION_ID      INT
            ,JOB_STATUS         CHAR(1)
            ,JOB_INFO           VARCHAR(50)
            ,LOCKED_BY          VARCHAR(50)
            ,UDATE              DATETIME
            ,CDATE              DATETIME
            CONSTRAINT PK_JBMTestTARGET PRIMARY KEY CLUSTERED (JOB_ID ASC)
            )
    ;

    Alright... that's my test harness, and that last part identifies the changes you need to make to your target table (again... you only need to remove the IDENTITY property and add NOT NULL in its place if it's not already there).  If for some reason you absolutely need to keep the IDENTITY property in the target table, post back, because I have a workaround for that which one of the Microsoft folks coughed up.  It's not much more difficult, but the IDENTITY property still makes sucking sounds here.

    Now, for the problem at hand.

    Using the test harness I setup above, the following copies the 11+ million rows from the source table to the target table in a "Minimally Logged" fashion.  Again, there are details in the comments that you should read.  The bottom line is, it only takes 24 seconds on my laptop and that's with the Clustered Index already in place.

    --=====================================================================================================================
    --      We're ready for the faster way to do inserts from the source table.
    --      This took 00:00:24 (hh:mi:ss) on my laptop. Yeah... you're reading that correctly... 24 seconds.
    --=====================================================================================================================
    --===== Ensure that we're in the Bulk-Logged Recovery Model.
         -- You'll need to change the name of the database here.
      ALTER DATABASE [FragTest] SET RECOVERY BULK_LOGGED WITH NO_WAIT
    ;
    --===== You said the target table would be empty.  Let's make sure that's true although it's not necessary.  You can
         -- still get "Minimal Logging" even with rows in the table and a Clustered Index (only) since 2008.
            TRUNCATE TABLE dbo.JBMTestTARGET
    ;
    --===== Now, it's simple and fast and "Minimally Logged" when we transfer data to the target table.
     INSERT INTO dbo.JBMTestTARGET WITH (TABLOCK) --Required for "Minimal Logging"
            (
              JOB_ID, JOB_DEFINITION_ID, PROCESS_ID, PARENT_JOB_ID, PREVIOUS_BATCH_ID, FROM_ID, FROM_SOURCE_ID
            , FROM_SESSION_ID, TO_ID, TO_SOURCE_ID, TO_SESSION_ID, JOB_STATUS, JOB_INFO, LOCKED_BY, UDATE, CDATE
            )
     SELECT  JOB_ID            
            ,JOB_DEFINITION_ID 
            ,PROCESS_ID        
            ,PARENT_JOB_ID     
            ,PREVIOUS_BATCH_ID 
            ,FROM_ID           
            ,FROM_SOURCE_ID    
            ,FROM_SESSION_ID   
            ,TO_ID             
            ,TO_SOURCE_ID      
            ,TO_SESSION_ID     
            ,JOB_STATUS        
            ,JOB_INFO          
            ,LOCKED_BY         
            ,UDATE             
            ,CDATE             
       FROM dbo.JBMTestSOURCE
      ORDER BY JOB_ID    --Because the Clustered Indexes are identical, this isn't necessary. It's just a safeguard for "Minimal Logging".
     OPTION (RECOMPILE)  --Helps guarantee "Minimal Logging"
    ;
    --===== If your database started out in the FULL Recovery Model, change it back to that.
         -- You'll need to change the name of the database here.
      ALTER DATABASE [FragTest] SET RECOVERY FULL WITH NO_WAIT
    ;
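    If you want to confirm that the insert really was minimally logged, one rough check is to count the per-row insert log records: a fully logged insert writes one LOP_INSERT_ROWS record per row, so seeing far fewer than 11.7 million of them suggests minimal logging. This is a sketch; sys.fn_dblog is undocumented, so only run it on a test box.

    ```sql
    -- Count the per-row insert log records against the target table.
    SELECT COUNT(*) AS InsertLogRecords
      FROM sys.fn_dblog(NULL, NULL)
     WHERE Operation = N'LOP_INSERT_ROWS'
       AND AllocUnitName LIKE N'dbo.JBMTestTARGET%';
    ```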

    Heh... and now you can change the title of this thread from "Insert Faster Way" to "Insert Way Faster". 😀

    --Jeff Moden



  • p.s.  We really need to fix that 50% FILL FACTOR thing.  It's wasting a ton of memory.  We can look at that once we've solved your original problem.
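    If it turns out nothing actually needs the 50, rebuilding the index at 100% would reclaim roughly half the space; a sketch using the index name from the earlier post (do it off-hours, since it takes the table offline as written):

    ```sql
    -- Rebuild the clustered index with full pages instead of half-empty ones.
    ALTER INDEX [ClusteredIndex-20190103-142533] ON dbo.JOB
    REBUILD WITH (FILLFACTOR = 100, SORT_IN_TEMPDB = ON);
    ```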

    --Jeff Moden



  • p.p.s.  BTW, what's the data type of that JOB_INFO column?

    --Jeff Moden



  • Jeff Moden - Friday, January 4, 2019 11:55 PM


    Thanks Jeff, and I really appreciate your time on this. I'll test this out soon... [JOB_INFO] is [nvarchar](400) NULL. For your reference, the DDL of the table is:

    CREATE TABLE [dbo].[JOB](
        [JOB_ID] [bigint] IDENTITY(1,1) NOT NULL,
        [JOB_DEFINITION_ID] [bigint] NOT NULL,
        [PROCESS_ID] [uniqueidentifier] NULL,
        [PARENT_JOB_ID] [bigint] NULL,
        [PREVIOUS_BATCH_ID] [uniqueidentifier] NULL,
        [FROM_ID] [nvarchar](400) NOT NULL,
        [FROM_SOURCE_ID] [bigint] NULL,
        [FROM_SESSION_ID] [bigint] NULL,
        [TO_ID] [nvarchar](400) NULL,
        [TO_SOURCE_ID] [bigint] NULL,
        [TO_SESSION_ID] [bigint] NULL,
        [JOB_STATUS] [bigint] NOT NULL,
        [JOB_INFO] [nvarchar](400) NULL,
        [LOCKED_BY] [uniqueidentifier] NULL,
        [UDATE] [datetime] NOT NULL,
        [CDATE] [datetime] NOT NULL
    ) ON [PRIMARY]

    GO

    ALTER TABLE [dbo].[JOB] ADD DEFAULT (getdate()) FOR [UDATE]
    GO

    ALTER TABLE [dbo].[JOB] ADD DEFAULT (getdate()) FOR [CDATE]
    GO


  • RGP - Saturday, January 5, 2019 3:37 AM


    Ok... thanks.  Even more concerning than the JOB_INFO column is the fact that the FROM_ID and TO_ID columns are defined as NVARCHAR(400), and not just for copy performance.  It's unusual for "ID" columns to have such a wide variable width, and if they really are used as "IDs" in joins and the like, I can see there being some real performance issues in many other areas.

    Shifting gears back to the original problem: obviously the data will take longer to transfer than in my 24-second example, simply because the byte size of most columns in the real table is at least double what I tested with.  Because of the NVARCHAR(400) columns, I wouldn't be surprised if the copy took up to 4 minutes or so.  Of course, that's a total swag on my part, but I just want to set expectations.
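    To calibrate that expectation against the real table, the partition stats give its actual size and average row width; a generic sketch against the standard DMV:

    ```sql
    -- Actual row count, allocated space, and average bytes per row
    -- for the heap (index_id 0) or clustered index (index_id 1).
    SELECT p.row_count
          ,p.used_page_count * 8 AS used_kb
          ,(p.used_page_count * 8192) / NULLIF(p.row_count, 0) AS avg_bytes_per_row
      FROM sys.dm_db_partition_stats p
     WHERE p.object_id = OBJECT_ID(N'dbo.JOB')
       AND p.index_id IN (0, 1);
    ```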

    --Jeff Moden


