How to delete in batches ?

  • Hi,

    I am working on a GPS related issue. There is a table where we keep geographical information of vehicles tracked using GPS devices. This devise will be sending data to the server every second. So we have a huge dump of data in the tables. There is a table for each day. Table structure is :

    TabsenPK Int

    DateTimestamp BigInt ( We keep in UTC format which is managed by JAVA)

    Longitude Decimal(16,9)

    Lattitude Decimal(16,9)

    Field1 Int

    Field2 Int

    Field3 Int

    Field4 Int ( Field1 to Field4 are used to keep few details )

    I need to delete all records from this table where DateTimeStamp is prior to a passed date. This will come to millions of records. What is the best method to delete this records with least performance impact ? How can I implement a batch based removal ?

  • I do not understand what you mean by batch based. You can issue only one delete statement with the datetimestamp in the WHERE clause and it will be one batch.

    If you are worried about the transaction log growth of a huge DELETE statement, you can split it into different smaller transactions by altering the where clause and put it through a loop.

    There is no way that you can turn of logging to speed the process up, but you can possibly alter your design (you mention you have a table for each day?) to truncate the whole table and it should be much quicker.

  • sanujss (5/23/2012)


    Hi,

    I am working on a GPS related issue. There is a table where we keep geographical information of vehicles tracked using GPS devices. This devise will be sending data to the server every second. So we have a huge dump of data in the tables. There is a table for each day. Table structure is :

    TabsenPK Int

    DateTimestamp BigInt ( We keep in UTC format which is managed by JAVA)

    Longitude Decimal(16,9)

    Lattitude Decimal(16,9)

    Field1 Int

    Field2 Int

    Field3 Int

    Field4 Int ( Field1 to Field4 are used to keep few details )

    I need to delete all records from this table where DateTimeStamp is prior to a passed date. This will come to millions of records. What is the best method to delete this records with least performance impact ? How can I implement a batch based removal ?

    you can use the top clause in your delete statement

    e.g.

    delete TOP (1000) from mytable where x=y

    you can also loop that statement with a delay if you like

    declare @rc bigint=1

    while (@rc>0)

    begin

    delete top(100) from #x

    set @rc=@@ROWCOUNT

    waitfor delay '00:00:05'

    end

    this might help you delete in batches , which would have less of a hit on transaction logs and avoid massive deletes that block users for hours on end

    you might have to try different quantities and different delays, but i usually find that there is a tipping point where the number of rows you delete causes the query to run way too long

    MVDBA

  • This article may help, http://www.sqlservercentral.com/articles/T-SQL/67898/.

  • I would have (probably clustered) index on your datetimestamp column. then do something like this:

    declare @error int, @rowcount int

    set @rowcount = 9999

    while @rowcount > 0

    begin

    begin tran

    delete top (50000) from gpstable where datetimestamp < 'somedatehere'

    select @error = @@ERROR, @rowcount = @@ROWCOUNT

    IF @error <> 0

    BEGIN

    ROLLBACK TRAN

    --report error?

    BREAK --get out of loop

    END

    COMMIT TRAN

    --delay maybe to keep query from hammering system?

    waitfor delay '00:00:01'

    end

    the DELETE should be validated to make index seeks and row or page locks, which will avoid locking up the table for concurrent reads/inserts. You may need to adjust the 50000 number to ensure get the right plan - although with a clustered index I think you will be guaranteed to get a seek regardless of the number

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Thanks to every one for the quick help. This works fine for a table. One more help I need.

    I have a partitioned view in which I have all these tables joined using union.

    If I try to delete using TOP , it will not allow.

    Also, I tried setting the rowcount to 10000 and run the delete. But it is ignoring the rowcount limit as it is a view.

    What is the alternative here ?

  • I would identify the tables that you need to delete records frm and delete them from the tables directly.

  • Lynn Pettis (5/24/2012)


    I would identify the tables that you need to delete records frm and delete them from the tables directly.

    +100

    MVDBA

  • Hi.. Thanks every one ... I have resolved this.

    I have dynamically created procedures for each table which meets the criteria, rather than running dynamic SQL to delete the data. Then executed the proc and dropped it. It worked well .

    Thanks all bros in sql server central.. They are always there for help.. Kudos bros..

  • Just in case you (or anybody else) would like to read yet another post on the topic, I've written about this.



    Twitter: @SQLife
    Email: sqlsalt(at)outlook(dot)com

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply