Archiving Data Script

  • I wrote a script to archive and delete records rom a table back in 2005 and 2009.

    I can't seem to get the syntax right.

    Does anyone have a sample script to simply archive and delete records?

    Any help would be greatly appreciated. 🙂

    This is what I have so far.:blush:

    DECLARE @ArchiveDate Datetime

    SET @ArchiveDate = (SELECT TOP 1 DATEPART(yyyy,Call_Date)

    FROM tblCall

    ORDER BY Call_Date)

    --SELECT @ArchiveDate AS ArchiveDate

    DECLARE @Active bit

    SET @Active = (SELECT Active_Indicator FROM ArchiveDriver)

    WHILE (@ArchiveDate <=1000) AND @Active = 1

    BEGIN TRAN

    INSERT INTO [PrimDataArchive].[dbo].[tblCallArchive]

    SELECT *

    FROM [PrisimDataSource].[dbo].[tblCall]

    WHERE DATEPART(yyyy,Call_Date) = @ArchiveDate

    IF @@ERROR <> 0

    BEGIN

    ROLLBACK TRAN

    RAISERROR ('Error occured while copying data to Archive.dbo.Orders', 16, 1)

    RETURN -1

    END

    BEGIN

    DELETE FROM [PrisimDataSource].[dbo].[tblCall]

    WHERE DATEPART(yyyy,Call_Date) = @ArchiveDate

    IF @@ERROR <> 0

    BEGIN

    ROLLBACK TRAN

    RAISERROR ('Error occured while deleting data from dbo.Orders', 16, 1)

    RETURN -1

    END

    IF @@TRANCOUNT > 0

    BEGIN

    COMMIT TRAN

    RETURN 0

    END

    END

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • I need something like the code listed below but it has to archive, Rollback if there is an error.

    Something very simple not.

    The following code only deletes and there is no error handling.

    -- Gradual Delete Sample

    -- Glenn Berry

    -- August 2011

    -- https://sqlserverperformance.wordpress.com/

    -- Twitter: GlennAlanBerry

    SET NOCOUNT ON;

    -- Check space used by table before we begin

    EXEC sp_spaceused N'dbo.BigLoggingTable';

    -- Declare local variables

    DECLARE @NumberOfLoops AS int;

    SET @NumberOfLoops = 5000;

    DECLARE @CurrentLoop AS int;

    SET @CurrentLoop = 0

    DECLARE @DeleteSize bigint;

    SET @DeleteSize = 500;

    DECLARE @HighWaterMark bigint;

    SET @HighWaterMark = 382989078;

    WHILE @CurrentLoop < @NumberOfLoops

    BEGIN

    -- Just delete any xxx rows that are below the HighWaterMark

    DELETE

    FROM dbo.BigLoggingTable

    WHERE TransactionId IN

    (SELECT TOP(@DeleteSize) TransactionId

    FROM dbo.BigLoggingTable WITH (NOLOCK)

    WHERE TransactionId < @HighWaterMark);

    WAITFOR DELAY '00:00:00:50';

    SET @CurrentLoop = @CurrentLoop + 1;

    END

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • If you want to be able to rollback, then ditch the entire loop construct and just have the insert and the delete as single statements in a transaction with a try-catch block to roll back on any errors. Don't use @@Error, that's SQL 2000 and earlier error handling.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I know that I still have to add the TRY... Catch logic but how else can I improve this code?

    DECLARE @ArchiveDate DateTime

    DECLARE @Active bit

    DECLARE @RowCount int

    SET @Active = (SELECT Active_Indicator FROM ArchiveDriver)

    SET @ArchiveDate = '1990-01-01' -- Pass into a paramter

    --SELECT @ArchiveDate AS ArchiveDate

    SET ROWCOUNT 100

    SELECT *

    FROM [PrismDataSource].[dbo].[tblCall]

    WHERE Call_Date < = @ArchiveDate

    WHILE @Active = 1

    BEGIN

    BEGIN TRAN

    INSERT INTO PrismDataArchive.dbo.tblCallArchive

    SELECT *

    FROM [PrismDataSource].[dbo].[tblCall]

    WHERE Call_Date < = @ArchiveDate

    IF @@ERROR <> 0

    BEGIN

    ROLLBACK TRAN

    END

    END

    BEGIN

    SET ROWCOUNT 100

    DELETE

    FROM [PrismDataSource].[dbo].[tblCall]

    WHERE Call_Date < = @ArchiveDate

    IF @@ERROR <> 0

    BEGIN

    ROLLBACK TRAN

    END

    IF @@TRANCOUNT > 0

    BEGIN

    COMMIT TRAN

    END

    END

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Did you read this post from Gail?

    GilaMonster (8/10/2015)


    If you want to be able to rollback, then ditch the entire loop construct and just have the insert and the delete as single statements in a transaction with a try-catch block to roll back on any errors. Don't use @@Error, that's SQL 2000 and earlier error handling.

    There's your answer.

  • Yes I should go with Try, Catch but I was to loop so that I only commit so many records at a time.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (8/10/2015)


    Yes I should go with Try, Catch but I was to loop so that I only commit so many records at a time.

    What about the part about ditching the loop? She also mentioned @@Error. There were four parts to the post:

    1. Ditch the entire loop construct.

    2. Have the rt and the delete as single statements in a transaction.

    3. Try-catch block to roll back on any errors

    4. Don't use @@Error.

  • I had a Loop to limit the batch size. What is the alternative?

    I added the TRY CATCH.

    There are only 25 records and it is taking for ever.

    The only DML that is performed is the INSERT into the temporary table.

    I performed this years ago without any trouble.

    Any help would be greatly appreciated.

    CREATE TABLE #Archive (

    tblCallID int)

    DECLARE @Active bit

    SET @Active = (SELECT Active_Indicator FROM ArchiveDriver)

    DECLARE @ArchiveDate Datetime

    DECLARE @COUNTER int

    DECLARE @RowCount int

    SET @ArchiveDate = '1900-01-01' -- Will be past as a Parameter

    SET @COUNTER = 0

    -- There are only 25 records that match that criteria

    --SELECT *

    --FROM [PrismDataSource].[dbo].[tblCall]

    --WHERE Call_Date < @ArchiveDate

    WHILE @Active = 1

    BEGIN

    BEGIN TRY

    BEGIN TRANSACTION

    INSERT INTO #Archive

    SELECT Call_ID FROM [PrismDataSource].[dbo].[tblCall]

    WHERE Call_Date < @ArchiveDate

    INSERT INTO tblCallArchive

    SELECT *

    FROM [PrismDataSource].[dbo].[tblCall] -- WITH (NOLOCK

    WHERE Call_ID IN (SELECT Call_ID FROM #Archive)

    DELETE FROM [PrismDataSource].[dbo].[tblCall]

    WHERE Call_ID IN (SELECT Call_ID FROM #Archive)

    TRUNCATE TABLE #Archive

    COMMIT TRANSACTION

    END TRY

    BEGIN CATCH

    IF @@TRANCOUNT > 0

    ROLLBACK TRANSACTION;

    END CATCH

    BEGIN

    ROLLBACK TRAN

    END

    END

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (8/10/2015)


    I had a Loop to limit the batch size. What is the alternative?

    I added the TRY CATCH.

    There are only 25 records and it is taking for ever.

    The only DML that is performed is the INSERT into the temporary table.

    I performed this years ago without any trouble.

    Any help would be greatly appreciated.

    CREATE TABLE #Archive (

    tblCallID int)

    DECLARE @Active bit

    SET @Active = (SELECT Active_Indicator FROM ArchiveDriver)

    DECLARE @ArchiveDate Datetime

    DECLARE @COUNTER int

    DECLARE @RowCount int

    SET @ArchiveDate = '1900-01-01' -- Will be past as a Parameter

    SET @COUNTER = 0

    -- There are only 25 records that match that criteria

    --SELECT *

    --FROM [PrismDataSource].[dbo].[tblCall]

    --WHERE Call_Date < @ArchiveDate

    WHILE @Active = 1

    BEGIN

    BEGIN TRAN

    BEGIN TRY

    BEGIN TRANSACTION

    INSERT INTO #Archive

    SELECT Call_ID FROM [PrismDataSource].[dbo].[tblCall]

    WHERE Call_Date < @ArchiveDate

    INSERT INTO tblCallArchive

    SELECT *

    FROM [PrismDataSource].[dbo].[tblCall] -- WITH (NOLOCK

    WHERE Call_ID IN (SELECT Call_ID FROM #Archive)

    DELETE FROM [PrismDataSource].[dbo].[tblCall]

    WHERE Call_ID IN (SELECT Call_ID FROM #Archive)

    TRUNCATE TABLE #Archive

    COMMIT TRANSACTION

    END TRY

    BEGIN CATCH

    IF @@TRANCOUNT > 0

    ROLLBACK TRANSACTION;

    END CATCH

    BEGIN

    ROLLBACK TRAN

    END

    END

    Might be taking forever because you don't have any logic to change the value of @active after you enter the loop, so you have an endless loop.



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • oops, wrong window. 😎



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • Welsh Corgi (8/10/2015)


    I had a Loop to limit the batch size.

    If you want it in a transaction so you can roll it back, it is pointless to do it in batches. Pick which one you want, the ability to roll the entire thing back, or the ability to do it in small chunks that commit individually.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (8/10/2015)


    Welsh Corgi (8/10/2015)


    I had a Loop to limit the batch size.

    If you want it in a transaction so you can roll it back, it is pointless to do it in batches. Pick which one you want, the ability to roll the entire thing back, or the ability to do it in small chunks that commit individually.

    My intent was to do it in small chunks so that it does not take forever and lock users out of the table.

    that commit individually?

    Thanks.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • One criticism would be using "WHERE DATEPART(yyyy,Call_Date)" everywhere. It is much better to set @ArchiveDate to the oldest instant you want to keep (say midnight Jan 1 of the year after the year to be archived), and use "WHERE Call_Date < @ArchiveDate".

    If you can use the OUTPUT clause with DELETE, the delete & archive can be one statement and no transaction is needed. This requires that tblCall is not a view, and tblCallArchive has no triggers, check constraints, rules, or foreign keys.

    DECLARE @Batchsize INT;

    DECLARE @ArchiveDate DATETIME;

    SELECT @Batchsize INT = 5000; -- Used to preset @@ROWCOUNT

    WHILE @@ROWCOUNT > 0 AND (SELECT Active_Indicator FROM ArchiveDriver) = 1 BEGIN

    DELETE TOP (@Batchsize)

    FROM [PrisimDataSource].[dbo].[tblCall]

    OUTPUT DELETED.* INTO [PrimDataArchive].[dbo].[tblCallArchive]

    WHERE Call_Date < @ArchiveDate;

    END

    I omitted the try-catch block because there is no transaction to rollback. The calling program might put a try-catch around the EXEC for this procedure, if there is some error handling you want to do there.

    If OUTPUT is not an option and separate INSERT and DELETE statements are required, you need a transaction. But your original code does not guarantee the INSERT and DELETE are handling the same rows (no ORDER BY). A more robust solution would be to fill a table variable with the clustered key values for the next batch of rows, then use them for both statements. This is written as though the table has a single-column integer primary key, named PK. If it is a compound key, the "WHERE PK IN" clauses could be replaced by an INNER JOIN or even WHERE EXISTS. The CATCH block is very basic, I would normally put more transaction state checking around the COMMIT and ROLLBACK but that was not the point of this post.

    DECLARE @keys TABLE (K1 INT NOT NULL PRIMARY KEY CLUSTERED);

    WHILE 1=1 BEGIN

    INSERT INTO @keys ( K1 )

    SELECT TOP (@Batchsize) PK

    FROM [PrisimDataSource].[dbo].[tblCall]

    WHERE Call_Date < @ArchiveDate;

    IF @@ROWCOUNT = 0 OR (SELECT Active_Indicator FROM ArchiveDriver) = 0 BREAK;

    BEGIN TRY

    BEGIN TRAN

    INSERT INTO [PrimDataArchive].[dbo].[tblCallArchive]

    SELECT *

    FROM [PrisimDataSource].[dbo].[tblCall]

    WHERE PK IN (SELECT PK FROM @keys);

    DELETE FROM [PrisimDataSource].[dbo].[tblCall]

    WHERE PK IN (SELECT PK FROM @keys);

    COMMIT TRAN

    END TRY

    BEGIN CATCH

    ROLLBACK TRAN

    RAISERROR()

    RETURN -1

    END CATCH

    DELETE @keys;

    END

  • Great code but it is not doing what I would expect.

    Before I run the script for the first time I have the following in the table.

    CallDateYearRecordDate

    NULL46

    189924

    19981295

    19997967

    20006600

    200150483

    200267137

    200370068

    200467311

    200568502

    200668595

    200784438

    2008104377

    2009151171

    2010200165

    2011232679

    2012242464

    2013285255

    2014355890

    2015236886

    After I run it there are the same number of records in the Source Table (tblCall).

    However in the Archive table I have records.

    CallDateYearRecordDate

    NULL46

    189925

    I set the ArchiveDate to SET @ArchiveDate = '1900-01-01'

    What am I doing wrong?

    NULL46

    189925

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • It sounds like this is a maintenance process that should be run off-hours.

    You have calls in your table from the year 1899? :ermm:

    Did you try to update the code to do what you want it to do? What do you have now?

Viewing 15 posts - 1 through 15 (of 32 total)

You must be logged in to reply to this topic. Login to reply