Delete duplicate records from Archived Table the exist in the unarchived table.

  • I need to deletes the records that exist in the PrismData.dbo.tblCall Table that exist in the Archived Table'PrismDataArchive.dbo.tblCallArchive '

    Here are the duplicates:

    SELECT COUNT(*)

    FROM PrismData.dbo.tblCall AS Call_Prod

    INNER JOIN PrismDataArchive.dbo.tblCallArchive AS arc ON Call_Prod.Call_ID = arc.Call_ID

    I want to insert the records to be deleted to be safe.

    I know this is easy. I had to work over the Christmas Holiday and I'm so tired.

    Any help would be greatly appreciated.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • SELECT *

    --INTO ZZZ_tblCall_ArchivedSafe

    FROM PrismData.dbo.tblCall AS Call_Prod

    INNER JOIN PrismDataArchive.dbo.tblCallArchive AS arc ON Call_Prod.Call_ID = arc.Call_ID

    I need a *.arc or something.

    Happy holidays.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • I would start by verifying the rows you want to delete like this. You didn't post the DDL, so I don't have your column list. You'll need to substitute your columns for column_list.

    WITH cteToDelete AS (

    SELECT column_list

    FROM PrismData.dbo.tblCall

    INTERSECT

    SELECT column_list

    FROM PrismDataArchive.dbo.tblCallArchive

    )

    SELECT column_list

    FROM cteToDelete;

    Once you know that you have the rows you want to delete, you can simply use the result set to perform the delete. You'll have to change the primary_key to the name of your primary key. Because of the intersect, you can't delete directly from the CTE.

    WITH cteToDelete AS (

    SELECT column_list

    FROM PrismData.dbo.tblCall

    INTERSECT

    SELECT column_list

    FROM PrismDataArchive.dbo.tblCallArchive

    )

    DELETE FROM PrismData.dbo.tblCall

    WHERE primary_key IN (SELECT primary_key FROM cteToDelete);

    Note that there are several ways to approach this problem. DELETE FROM...WHERE IN, DELETE FROM...WHERE EXISTS, etc. The approach I posted here gives you a verification step so you don't make a mistake in production.

    BTW, shouldn't the procedure that archives the data to the archive database handle deletion of the archived rows from the current table?

  • It actually isn't quite as easy as you probably think it is. There are at least 4 issues that can cause you pain:

    1) locking/blocking

    2) crushing your server (IO reads, tlog and dirty page writes, CPU & RAM (if you wind up with bad plan or huge hashes))

    3) tlog size, or G-d forbid filling it up

    You didn't mention your table construct, but I will assume your ID is the ubiquitous int identity 1,1 that EVERY SQL Server table seems to have as it's clustered PK. If so, pseudo code gets you the following:

    get min and max IDs that you need to archive

    START LOOP between min up to max ID in some batch size (10K, 100K?) that will ensure

    a) index seeks

    b) row or page locks only (can force those, but be careful!!) to avoid long blocks

    c) don't crush server for too long

    start EXPLICIT transaction (VERY important)

    INSERT rows to be deleted where you need them to go (ID >= @counter AND ID < @counter + batchsize)

    error checking

    DELETE same rows

    error checking

    COMMIT

    increment @counter

    WAITFOR DELAY 'some small time' --allow server some breathing room

    LOOP

    OPTIONAL?? set up monitoring in the loop to check tlog percent full (dbcc sqlperf(logspace)) and pause for a tlog backup if necessary. Obviously if you are in SIMPLE recovery mode this will be unnecessary

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I just need to convert the following into a Delete Statement for the records in Database PrismDsta.

    Thank you.

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • What is wrong with the following Statement?

    I need to delete from PrismData.dbo.tblCall .

    This is so simple; I can't recall the syntax.

    I had a rough Holiday.

    DELETE

    FROM PrismData.dbo.tblCall AS Call_Prod

    INNER JOIN PrismDataArchive.dbo.tblCallArchive AS arc ON Call_Prod.Call_ID = arc.Call_ID

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (12/28/2015)


    I just need to convert the following into a Delete Statement for the records in Database PrismDsta.

    Thank you.

    Okay, but I don't know that you can delete from two tables at the same time. I thought you only wanted to delete from the current database where they were in the archive database.

  • Ed Wagner (12/28/2015)


    I would start by verifying the rows you want to delete like this. You didn't post the DDL, so I don't have your column list. You'll need to substitute your columns for column_list.

    WITH cteToDelete AS (

    SELECT column_list

    FROM PrismData.dbo.tblCall

    INTERSECT

    SELECT column_list

    FROM PrismDataArchive.dbo.tblCallArchive

    )

    SELECT column_list

    FROM cteToDelete;

    Once you know that you have the rows you want to delete, you can simply use the result set to perform the delete. You'll have to change the primary_key to the name of your primary key. Because of the intersect, you can't delete directly from the CTE.

    WITH cteToDelete AS (

    SELECT column_list

    FROM PrismData.dbo.tblCall

    INTERSECT

    SELECT column_list

    FROM PrismDataArchive.dbo.tblCallArchive

    )

    DELETE FROM PrismData.dbo.tblCall

    WHERE primary_key IN (SELECT primary_key FROM cteToDelete);

    Note that there are several ways to approach this problem. DELETE FROM...WHERE IN, DELETE FROM...WHERE EXISTS, etc. The approach I posted here gives you a verification step so you don't make a mistake in production.

    BTW, shouldn't the procedure that archives the data to the archive database handle deletion of the archived rows from the current table?

    Do I have to specify all of the columns? The table has over a hundred columns.:w00t:

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (12/28/2015)


    Ed Wagner (12/28/2015)


    I would start by verifying the rows you want to delete like this. You didn't post the DDL, so I don't have your column list. You'll need to substitute your columns for column_list.

    WITH cteToDelete AS (

    SELECT column_list

    FROM PrismData.dbo.tblCall

    INTERSECT

    SELECT column_list

    FROM PrismDataArchive.dbo.tblCallArchive

    )

    SELECT column_list

    FROM cteToDelete;

    Once you know that you have the rows you want to delete, you can simply use the result set to perform the delete. You'll have to change the primary_key to the name of your primary key. Because of the intersect, you can't delete directly from the CTE.

    WITH cteToDelete AS (

    SELECT column_list

    FROM PrismData.dbo.tblCall

    INTERSECT

    SELECT column_list

    FROM PrismDataArchive.dbo.tblCallArchive

    )

    DELETE FROM PrismData.dbo.tblCall

    WHERE primary_key IN (SELECT primary_key FROM cteToDelete);

    Note that there are several ways to approach this problem. DELETE FROM...WHERE IN, DELETE FROM...WHERE EXISTS, etc. The approach I posted here gives you a verification step so you don't make a mistake in production.

    BTW, shouldn't the procedure that archives the data to the archive database handle deletion of the archived rows from the current table?

    Do I have to specify all of the columns? The table has over a hundred columns.:w00t:

    What do you think? I'm sure you thought about your question before you posted it and I'm positive that you know more about your table than I do. I would think you should specify enough columns to make sure that you're uniquely identifying each row and accurately identifying the ones that are in both tables. The INTERSECT will work on whatever columns you tell it to, so make sure you tell it to use the rows to positively identify the rows that need to be deleted. You can verify it by SELECTing the rows before you delete them.

    As a bit of a sidebar, in SSMS, you can open the table object, and then click and drag the Columns node into the query window. It will give you a comma-separated list of the column names - all of them. You can then edit the list down from there. Just an SSMS tip I thought you might find useful from time to time.

  • I very much appreciate all of the help.

    I'm only dealing with 50,,, records.

    When I performed the archive I had a lookup table that stored the Batch Size etc.

    Can I do something as simple as the following which has a syntax error?

    BEGIN TRANSACTION

    DELETE FROM PrismData.dbo.tblCall

    INNER JOIN PrismDataArchive.dbo.tblCallArchive ON Call_Prod.Call_ID = arc.Call_ID

    -- Check results and perform a COMMIT Transactio or rollback

    -- COMMIT TRANSACTION

    -- ROLLBACK TRANSACTION

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (12/28/2015)


    I very much appreciate all of the help.

    I'm only dealing with 50,,, records.

    When I performed the archive I had a lookup table that stored the Batch Size etc.

    Can I do something as simple as the following which has a syntax error?

    BEGIN TRANSACTION

    DELETE FROM PrismData.dbo.tblCall

    INNER JOIN PrismDataArchive.dbo.tblCallArchive ON Call_Prod.Call_ID = arc.Call_ID

    -- Check results and perform a COMMIT Transactio or rollback

    -- COMMIT TRANSACTION

    -- ROLLBACK TRANSACTION

    If it's a syntax error, then you aren't going to be able to do it. Think about why - you're specifying to delete from two tables at the same time. That's why I posted a way to get the rows to delete and then to delete them.

  • Welsh Corgi (12/28/2015)


    I just need to convert the following into a Delete Statement for the records in Database PrismDsta.

    Thank you.

    Your code is very cleaver and the other post that were made.

    Thanks everyone.

    The following code is simplistic compared to the post that were made.

    I appreciate it very much.

    Do you see a problem with the following?

    BEGIN TRANSACTION

    DELETE Call_Prod

    FROM PrismData.dbo.tblCall AS Call_Prod

    INNER JOIN PrismDataArchive.dbo.tblCallArchive AS arc ON Call_Prod.Call_ID = arc.Call_ID

    -- ROLLBACK TRANSACTION

    -- COMMIT TRANSACTION

    For better, quicker answers on T-SQL questions, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Welsh Corgi (12/28/2015)


    I very much appreciate all of the help.

    I'm only dealing with 50,,, records.

    When I performed the archive I had a lookup table that stored the Batch Size etc.

    Can I do something as simple as the following which has a syntax error?

    BEGIN TRANSACTION

    DELETE FROM PrismData.dbo.tblCall

    INNER JOIN PrismDataArchive.dbo.tblCallArchive ON Call_Prod.Call_ID = arc.Call_ID

    -- Check results and perform a COMMIT Transactio or rollback

    -- COMMIT TRANSACTION

    -- ROLLBACK TRANSACTION

    Try something like this (non-ANSI-Standard TSQL). I know this, or something like it, works for UPDATEs.

    DELETE PrismData.dbo.tblCall

    FROM PrismData.dbo.tblCall

    INNER JOIN PrismDataArchive.dbo.tblCallArchive ON Call_Prod.Call_ID = arc.Call_ID

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Welsh Corgi (12/28/2015)


    Welsh Corgi (12/28/2015)


    I just need to convert the following into a Delete Statement for the records in Database PrismDsta.

    Thank you.

    Your code is very cleaver and the other post that were made.

    Thanks everyone.

    The following code is simplistic compared to the post that were made.

    I appreciate it very much.

    Do you see a problem with the following?

    BEGIN TRANSACTION

    DELETE Call_Prod

    FROM PrismData.dbo.tblCall AS Call_Prod

    INNER JOIN PrismDataArchive.dbo.tblCallArchive AS arc ON Call_Prod.Call_ID = arc.Call_ID

    -- ROLLBACK TRANSACTION

    -- COMMIT TRANSACTION

    The delete syntax looks ok as long as PrismData and PrismDataArchive databases are on the same server. What syntax error are you getting?

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply