Deleting older data from Large table

  • I was deleting the data from a table older than 2 yrs and run the process for1 hr and later started deleting older than 1 yr worth of data and run the process for same 1 hr. However, noticed when i changed to 1 yr it started deleting more rows like 3 times more. Do you think querying older than 2 yrs and deleting the data would be slower than 1 yr? If yes, why? Please explain?

    Thanks

  • The more data you delete in one batch, the larger the log records and and the more locks taken. Often easier and faster to delete in batches rather than one large group.

  • I was deleting with the batch size only. However, it started deleting more rows when i change the where clause older than 1 yr and previously it was older than 2 yrs.

  • Are you asking why the deleting of data older than 1 year is running faster than deleting data older than 2 years?
    Is this a job that runs  on a schedule, purging data older than 1 year (previously 2 years) and that old data is not re-inserted?

    My first thought is that now you have a whole year less of data in the table than you did before (after the first delete, of course).
    So, theoretically, the table is about half the size it was before, meaning it the query has half as many rows to check their age.

  • 1)
    Are you deleting with TSQL command ? Batch size is in  Bulk insert and Integration services packages, AFAIK.
    To delete with "batch size" you have to do it in a loop with "delete top nnnn". Is this what you're doing?

    2)
    If remaining records are much less of deleting records, and there is not referential integrity , may be faster to create a new table with same structure and copy the records you have to mantain in the new table, drop the old table and rename the new table with the original name.

  • Admingod - Wednesday, November 14, 2018 11:08 AM

    I was deleting the data from a table older than 2 yrs and run the process for1 hr and later started deleting older than 1 yr worth of data and run the process for same 1 hr. However, noticed when i changed to 1 yr it started deleting more rows like 3 times more. Do you think querying older than 2 yrs and deleting the data would be slower than 1 yr? If yes, why? Please explain?

    Thanks

    Less the number of years included in where clause (which equals less records) query analyzer sees rows faster.  The process is that if there is an index it usually scans all the records for 2 years and then starts to delete the records.

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply