Purge Process Help

  • Hi all,

    I am attempting to do a rather simple purge task on a very large table. This task will need to take place daily and delete records older than 6 months out of the database. On first pass this will delete well over 130 million rows. I thought the best way to handle this is create a proc and call the proc from a SQL Agent Job that runs nightly. Here is an example of the script, I need some fine tuning assistance:

    CREATE PROCEDURE usp_Purge_WCFLogger

    AS

    SET NOCOUNT ON

    EXEC sp_rename 'dbo.logs', 'logs_work'

    GO

    SELECT * INTO dbo.Logs_Backup FROM dbo.Logs_Work WHERE TIMESTAMP < DATEADD(month, -6, GETDATE())

    --SELECT COUNT(*) FROM dbo.Logs_Work WHERE TIMESTAMP < DATEADD(month, -6, GETDATE())

    TRUNCATE TABLE dbo.Logs_Work

    INSERT INTO dbo.Logs_Work

    SELECT * FROM dbo.Logs_Backup

    EXEC sp_rename 'dbo.Logs_Work', 'dbo.Logs'

    GO

    GO

    I am not a very good stored procedure writer so any advice is welcomed.

    MegaDBA

  • PrettyDBA (1/9/2014)


    Hi all,

    I am attempting to do a rather simple purge task on a very large table. This task will need to take place daily and delete records older than 6 months out of the database. On first pass this will delete well over 130 million rows. I thought the best way to handle this is create a proc and call the proc from a SQL Agent Job that runs nightly. Here is an example of the script, I need some fine tuning assistance:

    CREATE PROCEDURE usp_Purge_WCFLogger

    AS

    SET NOCOUNT ON

    EXEC sp_rename 'dbo.logs', 'logs_work'

    GO

    SELECT * INTO dbo.Logs_Backup FROM dbo.Logs_Work WHERE TIMESTAMP < DATEADD(month, -6, GETDATE())

    --SELECT COUNT(*) FROM dbo.Logs_Work WHERE TIMESTAMP < DATEADD(month, -6, GETDATE())

    TRUNCATE TABLE dbo.Logs_Work

    INSERT INTO dbo.Logs_Work

    SELECT * FROM dbo.Logs_Backup

    EXEC sp_rename 'dbo.Logs_Work', 'dbo.Logs'

    GO

    GO

    I am not a very good stored procedure writer so any advice is welcomed.

    MegaDBA

    You have some issues here. First of all, you can't have a GO inside a stored proc. Secondly, not sure why you rename a table to just rename it back later. From what I can see there is no point in renaming anything here. The biggest issue is that your logic here is backwards. You want to delete everything that is older than 6 months. You are inserting into your holding table the data you want to delete, not the data you want to retain. You also have an issue that this sproc will not work after the first time you run it because you are using a select into but not dropping that table.

    You said this is going to run nightly? I think I would break this into two scripts. The first script is a doozy. You have to remove all the data that is older than 6 months. This may be a lot of data and take some serious amount of time. That is a one time thing. I would write a script to do that by itself. Then your daily process is nothing more than a simple delete.

    I would think you could do something along these lines. Here is the first step.

    if OBJECT_ID('Logs_Backup') is null

    SELECT * --should use column names here

    INTO dbo.Logs_Backup

    FROM dbo.Logs

    where 1 = 2

    --This will create the Logs_Backup with the same structure as Logs_Work only if the table does not exist

    insert dbo.Logs_Backup (need columns here)

    select (need columns here)

    FROM dbo.Logs

    WHERE TIMESTAMP > DATEADD(month, -6, GETDATE())

    TRUNCATE TABLE dbo.Logs

    INSERT INTO dbo.Logs

    SELECT * FROM dbo.Logs_Backup

    At this point your log table will hold all the data for the last 6 months with the rest of it removed.

    Now for your daily procedure.

    CREATE PROCEDURE usp_Purge_WCFLogger AS

    SET NOCOUNT ON

    Delete

    FROM dbo.Logs

    WHERE TIMESTAMP < DATEADD(month, -6, GETDATE())

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Thanks Steve,

    This is exactly why I came out here. In theory what I want to do sounds good, but as soon as I try to translate that into t-sql code, the meaning (what I'm trying to do) gets lost.

    I will work off of the guidelines you gave me. Thanks!!!

  • Glad that sounds like a plan that will work you. Make sure you test test test!!!

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply