Suggestions on How to Improve Delete

  • Hi Guys, 
    Any suggestions on how to improve this Delete?
    delete was causing tempdb log to grow, also would UPDATE be faster than Delete , there are around 31 million records to delete...thanks

    DECLARE  @return_value int
    EXEC  @return_value = [dbo].[sp_Batches]
       @TARGET_JOB_DEFID =78,
       @TARGET_JOB_STATUS = -7,
       @TARGET_DELETE_BATCHSIZE = 100000,
       @TARGET_DELETE_BATCHES = 10

  • rinu philip - Thursday, December 27, 2018 4:13 PM

    Hi Guys, 
    Any suggestions on how to improve this Delete?
    delete was causing tempdb log to grow, also would UPDATE be faster than Delete , there are around 31 million records to delete...thanks

    DECLARE  @return_value int
    EXEC  @return_value = [dbo].[sp_Batches]
       @TARGET_JOB_DEFID =78,
       @TARGET_JOB_STATUS = -7,
       @TARGET_DELETE_BATCHSIZE = 100000,
       @TARGET_DELETE_BATCHES = 10

    Without the code for the stored procedure how are we supposed to know if it can be improved?

  • Thanks for the quick response, I’ll post the code soon

  • RGP - Thursday, December 27, 2018 4:33 PM

    Thanks for the quick response, I’ll post the code soon

    Also post how many "records" there are total in the table, please.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Lynn Pettis - Thursday, December 27, 2018 4:20 PM

    rinu philip - Thursday, December 27, 2018 4:13 PM

    Hi Guys, 
    Any suggestions on how to improve this Delete?
    delete was causing tempdb log to grow, also would UPDATE be faster than Delete , there are around 31 million records to delete...thanks

    DECLARE  @return_value int
    EXEC  @return_value = [dbo].[sp_Batches]
       @TARGET_JOB_DEFID =78,
       @TARGET_JOB_STATUS = -7,
       @TARGET_DELETE_BATCHSIZE = 100000,
       @TARGET_DELETE_BATCHES = 10

    Without the code for the stored procedure how are we supposed to know if it can be improved?

    Please see the attached SP

  • Jeff Moden - Thursday, December 27, 2018 4:49 PM

    RGP - Thursday, December 27, 2018 4:33 PM

    Thanks for the quick response, I’ll post the code soon

    Also post how many "records" there are total in the table, please.

    The table has around 30 million records

  • For those that don't want to download the file (formatting courtesy of SSC paste interpreter):
    SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE PROCEDURE [dbo].[sp_Batches]
      @TARGET_JOB_DEFID INT,
      @TARGET_JOB_STATUS INT,
      @TARGET_DELETE_BATCHSIZE INT = 10000,
      @TARGET_DELETE_BATCHES INT = 100
    AS
    BEGIN
      DECLARE @NUM_JOBS_TO_DELETE INT;
      SET @NUM_JOBS_TO_DELETE = @TARGET_DELETE_BATCHSIZE * @TARGET_DELETE_BATCHES;

      SELECT TOP (@NUM_JOBS_TO_DELETE) j.[JOB_ID],
        ROW_NUMBER() OVER (ORDER BY [JOB_ID]) / @TARGET_DELETE_BATCHSIZE AS [BATCH_NUM]
       INTO [#TMP_JOB_TO_DELETE]
        FROM [dbo].[JOB] j WITH(READPAST)
          WHERE j.[JOB_DEFINITION_ID] = @TARGET_JOB_DEFID AND j.[JOB_STATUS] = @TARGET_JOB_STATUS;

      IF (@@ERROR <> 0)
      BEGIN
       GOTO Failed;
      END

      CREATE CLUSTERED INDEX [IX_TMP_JOB_TO_DELETE] ON [#TMP_JOB_TO_DELETE]([BATCH_NUM]);
      IF (@@ERROR <> 0)
      BEGIN
       GOTO Failed;
      END

      CREATE TABLE [#TMP_JOB_DELETE_ID]([JOB_ID] bigint);

      IF ((SELECT COUNT(*) FROM [#TMP_JOB_TO_DELETE]) > 0)
      BEGIN
       DECLARE @DELETED_JOB_COUNT int;
       SET @DELETED_JOB_COUNT = 0;

       DECLARE @BATCH_NUMBER int;
       SET @BATCH_NUMBER = 0;

       DECLARE @CURRENT_BATCH_SIZE int;

       WHILE (1 = 1)
       BEGIN
         TRUNCATE TABLE [#TMP_JOB_DELETE_ID];

        INSERT INTO [#TMP_JOB_DELETE_ID]([JOB_ID])
        SELECT [JOB_ID]
          FROM [#TMP_JOB_TO_DELETE]
           WHERE BATCH_NUM = @BATCH_NUMBER;

        IF (@@ERROR <> 0)
        BEGIN
          GOTO Failed;
        END

            SET @CURRENT_BATCH_SIZE = (SELECT COUNT(*) FROM [#TMP_JOB_DELETE_ID]);
        IF (@CURRENT_BATCH_SIZE = 0)
        BEGIN
          BREAK;
        END

       
                BEGIN TRANSACTION;
        DELETE FROM [dbo].[JOB]
           WHERE [JOB_ID] IN (SELECT [JOB_ID] FROM [#TMP_JOB_DELETE_ID]);

                COMMIT TRANSACTION;
                IF (@@ERROR <> 0)
        BEGIN
          GOTO Failed;
        END
        
        SET @DELETED_JOB_COUNT = @DELETED_JOB_COUNT + @CURRENT_BATCH_SIZE;

       
        SET @BATCH_NUMBER = @BATCH_NUMBER + 1;
       END

       SELECT @DELETED_JOB_COUNT;
       DROP TABLE [#TMP_JOB_TO_DELETE];
       DROP TABLE [#TMP_JOB_DELETE_ID];
      
       RETURN;
      END

    Failed:
      SELECT 0;
    END
    GO

    Thom~

    Excuse my typos and sometimes awful grammar. My fingers work faster than my brain does.
    Larnu.uk

  • RGP - Friday, December 28, 2018 6:45 AM

    Jeff Moden - Thursday, December 27, 2018 4:49 PM

    RGP - Thursday, December 27, 2018 4:33 PM

    Thanks for the quick response, I’ll post the code soon

    Also post how many "records" there are total in the table, please.

    The table has around 30 million records

    OK.  I'm a little confused by that.  The table has around 30 million records but, according to your original post, you want to delete 31 million records??? :blink:

    Let me ask a different way...
    1.  How many row currently exist in the table?
    2.  Now many rows do you want to delete from the table?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Friday, December 28, 2018 7:17 AM

    RGP - Friday, December 28, 2018 6:45 AM

    Jeff Moden - Thursday, December 27, 2018 4:49 PM

    RGP - Thursday, December 27, 2018 4:33 PM

    Thanks for the quick response, I’ll post the code soon

    Also post how many "records" there are total in the table, please.

    The table has around 30 million records

    OK.  I'm a little confused by that.  The table has around 30 million records but, according to your original post, you want to delete 31 million records??? :blink:

    Let me ask a different way...
    1.  How many row currently exist in the table?
    2.  Now many rows do you want to delete from the table?

    1.  How many row currently exist in the table? => 63 million 
    2.  Now many rows do you want to delete from the table? ==> 31 million

  • Why not simply delete batches until there are no more left?  I don't see why the temp tables and transaction are necessary, the procedure just exits if an error occurs.

    WHILE 1=1 BEGIN
      DELETE TOP (@TARGET_DELETE_BATCHSIZE)
      FROM dbo.JOB
      WHERE JOB_DEFINITION_ID = @TARGET_JOB_DEFID AND JOB_STATUS = @TARGET_JOB_STATUS;

      IF @@ROWCOUNT = 0   BREAK;
    END

  • RGP - Friday, December 28, 2018 8:49 AM

    Jeff Moden - Friday, December 28, 2018 7:17 AM

    RGP - Friday, December 28, 2018 6:45 AM

    Jeff Moden - Thursday, December 27, 2018 4:49 PM

    RGP - Thursday, December 27, 2018 4:33 PM

    Thanks for the quick response, I’ll post the code soon

    Also post how many "records" there are total in the table, please.

    The table has around 30 million records

    OK.  I'm a little confused by that.  The table has around 30 million records but, according to your original post, you want to delete 31 million records??? :blink:

    Let me ask a different way...
    1.  How many row currently exist in the table?
    2.  Now many rows do you want to delete from the table?

    1.  How many row currently exist in the table? => 63 million 
    2.  Now many rows do you want to delete from the table? ==> 31 million

    It would likely be much more efficient if you did a minimally logged copy of the rows you want to keep into a new table, drop the original table, and rename the new table the same as the old one.

    Otherwise, just use a script similar to what Scott Coleman used above.  There's just one little change to prevent a final massive expensive scan at the end of the run....  change the WHERE clause to look for rowcounts less than the batch size.  Like this...

      IF @@ROWCOUNT < @TARGET_DELETE_BATCHSIZE  BREAK;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Wonderful thanks!!

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply