SQL Query Delete top 4000 rows when row count > 10K looping to reduce row count to 10K

  • Need a maintenance query for an alarm logging table to check whether the row count is more than 200K rows and, if so, delete 4K rows at a time until the total row count is back down to 200K.  I want to commit each 4K delete before looping so the lock on the table is minimal and pending writes can be committed.  I also want to send an email with the row count if the rows are above 200K.

    Table information
    USE [Alarms]
    GO
    SET ANSI_NULLS ON
    GO

    SET QUOTED_IDENTIFIER ON
    GO

    CREATE TABLE [GW-PLC].[AllEvent](
        [EventID] [uniqueidentifier] NOT NULL,
        [EventType] [int] NULL,
        [SourceName] [nvarchar](200) NULL,
        [SourcePath] [nvarchar](512) NULL,
        [SourceID] [uniqueidentifier] NULL,
        [ServerName] [nvarchar](50) NULL,
        [TicksTimeStamp] [bigint] NULL,
        [EventTimeStamp] [datetime2](7) NULL,
        [EventCategory] [nvarchar](50) NULL,
        [Severity] [int] NULL,
        [Priority] [int] NULL,
        [Message] [nvarchar](512) NULL,
        [ConditionName] [nvarchar](50) NULL,
        [SubConditionName] [nvarchar](50) NULL,
        [AlarmClass] [nvarchar](40) NULL,
        [Active] [bit] NULL,
        [Acked] [bit] NULL,
        [EffDisabled] [bit] NULL,
        [Disabled] [bit] NULL,
        [EffSuppressed] [bit] NULL,
        [Suppressed] [bit] NULL,
        [PersonID] [nvarchar](50) NULL,
        [ChangeMask] [int] NULL,
        [InputValue] [float] NULL,
        [LimitValue] [float] NULL,
        [Quality] [int] NULL,
        [EventAssociationID] [uniqueidentifier] NULL,
        [UserComment] [nvarchar](512) NULL,
        [ComputerID] [nvarchar](64) NULL,
        [Tag1Value] [nvarchar](128) NULL,
        [Tag2Value] [nvarchar](128) NULL,
        [Tag3Value] [nvarchar](128) NULL,
        [Tag4Value] [nvarchar](128) NULL,
        [Shelved] [bit] NULL,
        [AutoUnshelveTime] [datetime2](7) NULL,
        [GroupPath] [nvarchar](254) NULL,
    CONSTRAINT [PK_103_AllEvent] PRIMARY KEY NONCLUSTERED
    (
        [EventID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]
    GO

    This is Microsoft SQL Server 2017 (14.0.2002.14).

  • register 33943 - Tuesday, November 13, 2018 9:38 AM

    Need a maintenance query for an alarm logging table to check whether the row count is more than 200K rows and, if so, delete 4K rows at a time until the total row count is back down to 200K.  I want to commit each 4K delete before looping so the lock on the table is minimal and pending writes can be committed.  I also want to send an email with the row count if the rows are above 200K.


    What are the rules for deciding which rows will be deleted? Is there a date column which might identify which rows are "oldest"?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Good question.
    Yes, there is a good column for that: the event date/time stamp, [EventTimeStamp] [datetime2](7) NULL.
    Definitely want to delete the oldest records first.
    Couldn't figure out how to edit the title.  The body has the correct row counts.

  • You need to use the TOP operator in your delete statement, something like this:

    DELETE TOP(n) FROM <TABLE> WHERE <CONDITIONS>

    I wrote an article for MSSQLTips explaining how to implement DML statements in batches; you can take a look and maybe give it a try:

    https://www.mssqltips.com/sqlservertip/5783/optimize-large-sql-server-dml-processes-by-using-batches/
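
    Looped, that pattern boils down to something like the rough sketch below (an untested outline only: the table name and 4K batch size come from this thread, the WHERE condition is just a placeholder to replace with your own rule, and the article itself uses a different control mechanism):

    DECLARE @Deleted int = 1;

    WHILE @Deleted > 0
    BEGIN
        DELETE TOP (4000)
        FROM  [GW-PLC].[AllEvent]
        WHERE EventTimeStamp < '2018-01-01';   -- placeholder condition

        SET @Deleted = @@ROWCOUNT;             -- drops to 0 once no qualifying rows remain
    END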

    MCSE, MCSA SQL Server Database Developer/Administrator

  • epivaral - Tuesday, November 13, 2018 10:33 AM

    You need to use the TOP operator in your delete statement, something like this:

    DELETE TOP(n) FROM <TABLE> WHERE <CONDITIONS>

    I wrote an article for MSSQLTips explaining how to implement DML statements in batches; you can take a look and maybe give it a try:

    https://www.mssqltips.com/sqlservertip/5783/optimize-large-sql-server-dml-processes-by-using-batches/

    The examples in the MSSQLTips article don't use ORDER BY. If an ordinary index exists which facilitates more efficient retrieval of the rows to be processed than the clustered index, then the order of retrieval will be by the keys of that index, not the clustered index, and the value of @id_control will be somewhat unpredictable.
    Setting and using this value of @id_control also assumes that rows in a clustered index are guaranteed to be retrieved in cluster key order, which is not necessarily the case.
    Also, the looping mechanism will perform an unnecessary final loop after all qualifying rows have been updated.
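
    For what it is worth, one way to get an explicit ORDER BY into the batched delete, and to stop without an extra empty pass, is to drive the DELETE through an updatable CTE ordered by EventTimeStamp.  This is only a sketch built from the table and numbers in this thread; test it on a copy first:

    DECLARE @BatchSize int = 4000,
            @Deleted   int = 1,
            @RowsOver  bigint;

    -- how many rows have to go to get back down to 200,000
    SELECT @RowsOver = COUNT_BIG(*) - 200000
    FROM   [GW-PLC].[AllEvent];

    WHILE @RowsOver > 0 AND @Deleted > 0
    BEGIN
        ;WITH OldestRows AS
        (
            SELECT TOP (CASE WHEN @RowsOver < @BatchSize THEN @RowsOver ELSE @BatchSize END)
                   EventID
            FROM   [GW-PLC].[AllEvent]
            ORDER BY EventTimeStamp ASC        -- delete the oldest rows first
        )
        DELETE FROM OldestRows;                -- in autocommit mode each batch commits on its own

        SET @Deleted  = @@ROWCOUNT;            -- capture it straight away
        SET @RowsOver = @RowsOver - @Deleted;
    END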


  • Thank you for the information.  It is very helpful.
    The link explains how to handle the loop process.  Why does it say "but for it to have meaning, batch size must be less than 50% of the expected rows to be processed..."?  I have 48 million rows and want to minimize locking, which is why I wanted to keep the batch size at 4K rows.  Is there a flaw in my thought process?
    I am not clear on how to set up the email process so that it sends an email if the statement sees more than 200K rows and sends nothing if the row count is 200K or less.

  • register 33943 - Wednesday, November 14, 2018 7:14 AM

    Thank you for the information.  It is very helpful.
    The link explains how to handle the loop process.  Why does it say "but for it to have meaning, batch size must be less than 50% of the expected rows to be processed..."?  I have 48 million rows and want to minimize locking, which is why I wanted to keep the batch size at 4K rows.  Is there a flaw in my thought process?
    I am not clear on how to set up the email process so that it sends an email if the statement sees more than 200K rows and sends nothing if the row count is 200K or less.

    For batch size, I mean each loop (so in your case, since you want to delete rows in 4K batches, you are good to go).  Also, please keep in mind that the link is just an example using a primary key as a control column; you have to adapt the delete to suit your needs (and also test it on another server first).

    For the email process, the simplest way I can think of is to create a SQL Server Agent job that runs on a regular schedule and checks the table count; if the condition is met, use the stored procedure sp_send_dbmail to send a notification.  Something like this:


    DECLARE @tableRows bigint
    SELECT @tableRows=COUNT(1) FROM [GW-PLC].[AllEvent]

    IF(@tableRows>200000) --checking if your table has more than 200,000 records
    BEGIN  -- Send the email 

    EXEC msdb.dbo.sp_send_dbmail  
    @profile_name = 'your mail profile',  
    @recipients = 'youraddress@email.com',  
    @body = '[GW-PLC].[AllEvent] table has more than 200K rows',  
    @subject = 'Automated Message' ;

      -- do your cleanup tasks here or somewhere else.

    END
    -- if fewer rows are in the table, do nothing.

    MCSE, MCSA SQL Server Database Developer/Administrator

  • I see the 2 parts: loop delete / email.  What is the best way to incorporate this?  How do I incorporate the loop delete statement into the last SQL statement?
    Does it go here?  -- do your cleanup tasks here or somewhere else.
    Or are you telling me to have 2 processes: 1 to query the row count for the email notification and 1 to run the deletion loop of records 4K at a time?

  • register 33943 - Wednesday, November 14, 2018 8:20 AM

    I see the 2 parts: loop delete / email.  What is the best way to incorporate this?  How do I incorporate the loop delete statement into the last SQL statement?
    Does it go here?  -- do your cleanup tasks here or somewhere else.
    Or are you telling me to have 2 processes: 1 to query the row count for the email notification and 1 to run the deletion loop of records 4K at a time?

    It depends on your needs: you can put it in the same IF statement, or have another process perform the deletion.  Whichever option you choose, I strongly recommend testing it in a DEV environment first.
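
    If you go the single-script route, the overall shape is roughly the sketch below (a hedged outline only: the profile name, recipient and subject are the placeholders from the earlier snippet, and the cleanup section stands in for whichever batched delete you end up using):

    DECLARE @tableRows bigint;
    SELECT @tableRows = COUNT_BIG(*) FROM [GW-PLC].[AllEvent];

    IF (@tableRows > 200000)
    BEGIN
        -- 1) send the notification
        EXEC msdb.dbo.sp_send_dbmail
            @profile_name = 'your mail profile',
            @recipients = 'youraddress@email.com',
            @body = '[GW-PLC].[AllEvent] table has more than 200K rows',
            @subject = 'Automated Message';

        -- 2) then run the batched delete loop from earlier in the thread,
        --    or EXEC a stored procedure that wraps it
    END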

    MCSE, MCSA SQL Server Database Developer/Administrator

  • OK.
    I will work on that.
    I have the database copied into a test system to run the SQL statements against.

  • You are basically trying to avoid locking and blocking, so here is another idea:

    You could also consider inserting the rows you want to keep (the most recent 200K) into a new table, maybe in a different schema,
    then rename the tables or switch their schemas, which is a minimal metadata operation.
    Schemas have the added advantage of being able to have the same constraint names.

    CREATE TABLE [Archive].[AllEvent].....                 -- same definition as [GW-PLC].[AllEvent]
    INSERT INTO [Archive].[AllEvent]
    SELECT TOP 200000 * FROM [GW-PLC].[AllEvent] ORDER BY EventTimeStamp DESC   -- keep the newest rows
    BEGIN TRANSACTION
    ALTER SCHEMA [STAGING] TRANSFER [GW-PLC].[AllEvent]
    ALTER SCHEMA [GW-PLC] TRANSFER [Archive].[AllEvent]
    ALTER SCHEMA [Archive] TRANSFER [STAGING].[AllEvent]
    COMMIT TRANSACTION
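
    One small prerequisite if you go this way (an assumption on my part, since the thread's DDL only shows the [GW-PLC] schema): the [Archive] and [STAGING] schemas have to exist before the INSERT and the TRANSFER statements will run, for example:

    CREATE SCHEMA [Archive] AUTHORIZATION [dbo]   -- CREATE SCHEMA must be the first statement in its batch
    GO
    CREATE SCHEMA [STAGING] AUTHORIZATION [dbo]
    GO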

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. With your description of the problem, we can provide a tested, verifiable solution to your question! Asking the question the right way gets you a tested answer the fastest way possible!

  • good idea, thanks.

  • This works to send the email, but how can I insert @tableRows into the body of the email?

    USE Alarms

    DECLARE @tableRows int
    SELECT @tableRows=COUNT(1) FROM [GW-PLC].[AllEvent]

    IF(@tableRows>200000) --checking if your table has more than 200,000 records
    BEGIN -- Send the email

    EXEC msdb.dbo.sp_send_dbmail
    @profile_name = [SQL-Email],
    @recipients = [email.com],
    @body = '[GW-PLC].[AllEvent] table has more than 200K rows',
    @subject = 'SQL Database MSG From GW-SQL1 regarding Alarms-GW-PLC.ALLEvent' ;

    END

    I have tried editing the line to this but SQL does not like it.

    @body = '[GW-PLC].[AllEvent] table has more than 200K rows, 1 STEP CLOSER' @tableRows

  • Proc parameters can only be assigned values; you cannot concatenate values or use functions in the assignment.
    You simply pre-assemble the value before assigning it:
    USE Alarms

    DECLARE @tableRows int
    SELECT @tableRows=COUNT(1) FROM [GW-PLC].[AllEvent]

    IF(@tableRows>200000) --checking if your table has more than 200,000 records
    BEGIN -- Send the email
      DECLARE @NewBody varchar(max) = '[GW-PLC].[AllEvent] table has more than 200K rows' + '1 STEP CLOSER' + CONVERT(VARCHAR(30),@tableRows )

      EXEC msdb.dbo.sp_send_dbmail
      @profile_name = [SQL-Email],
      @recipients = [email.com],
      @body = @NewBody,
      @subject = 'SQL Database MSG From GW-SQL1 regarding Alarms-GW-PLC.ALLEvent' ;
    END
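
    If you prefer, CONCAT() will do the string conversion for you, so the pre-assembled body could equally be written as follows (same @tableRows variable as above; purely a stylistic alternative):

      DECLARE @NewBody varchar(max) = CONCAT('[GW-PLC].[AllEvent] table has more than 200K rows, 1 STEP CLOSER ', @tableRows)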

    Lowell



  • email worked flawlessly.  THANK YOU>>>>
    I need to create the delete loop of 4K records at a time until there are 200K records remaining by calling a stored procedure
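
    A sketch of what that stored procedure might look like, pulling together the batched, ordered delete discussed above (the procedure name and default parameter values are assumptions, and it should be run against the copied database first):

    CREATE OR ALTER PROCEDURE [dbo].[usp_TrimAllEvent]    -- hypothetical name
        @KeepRows  int = 200000,
        @BatchSize int = 4000
    AS
    BEGIN
        SET NOCOUNT ON;

        DECLARE @Deleted  int = 1,
                @RowsOver bigint;

        SELECT @RowsOver = COUNT_BIG(*) - @KeepRows
        FROM   [GW-PLC].[AllEvent];

        WHILE @RowsOver > 0 AND @Deleted > 0
        BEGIN
            ;WITH OldestRows AS
            (
                SELECT TOP (CASE WHEN @RowsOver < @BatchSize THEN @RowsOver ELSE @BatchSize END)
                       EventID
                FROM   [GW-PLC].[AllEvent]
                ORDER BY EventTimeStamp ASC    -- oldest records first
            )
            DELETE FROM OldestRows;            -- each batch commits on its own in autocommit mode

            SET @Deleted  = @@ROWCOUNT;
            SET @RowsOver = @RowsOver - @Deleted;
        END
    END
    GO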
