TSQL query help needed

  • Hi Team,

    I have 2 tables with exactly the same structure (i.e. the same columns and datatypes). The difference is that the source table has 100 million rows and the destination table has 0 rows.

    One more thing to mention: neither table has a primary key column, and the source table may also contain some duplicate data.

    So, what I am looking for is a way to copy rows from the source table to the destination table in small chunks (to avoid a large transaction, which may result in a large transaction log file), and at the same time delete those rows from the source table. Again, there is no primary key column in the source table to tell up to which point the rows have been copied over to the destination table.

    I tried to copy rows as below and wanted to truncate the source table once all rows were copied, but due to a space constraint I am getting the error below.

    I can add more disk space, but not at this point. So I am looking for T-SQL code that takes care of the INSERT into the destination table and the DELETE from the source table.

    Can anyone help me with how to put the DELETE logic into the batch below? Does ROW_NUMBER() have to be used?

    /*

    Could not allocate space for object 'destination_tbl' in database 'testdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

    */

    go

    --My piece of code
    DECLARE @BatchSize INT = 1000;

    WHILE 1 = 1
    BEGIN
        INSERT INTO destination_tbl
        SELECT TOP (@BatchSize) * FROM source_tbl;

        IF @@ROWCOUNT < @BatchSize
            BREAK;
    END;

    -- insert 50mb worth of data
    -- page count for a table
    create table source_tbl
    (c1 int identity,
     c2 char (8000) default 'a' --- each row fills one 8k page
    );

    insert into source_tbl default values
    go 6400

    create table destination_tbl
    (c1 int,
     c2 char (8000)
    );

    Thanks,

    Sam

  • It seems you are misunderstanding what SQL Server does with empty space in a table... It does nothing unless you tell it to.

    Executing DELETE statements against a table does not make the table consume less hard drive space. SHRINK does that, but it fragments your indexes. What you probably need is another filegroup on another drive, and then you can deal with the problem properly.
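
    If a drive does become available, the sketch below shows the general shape of that fix (the filegroup name, file path, and sizes are assumptions, not values from this thread):

    -- Add a new filegroup and a file for it on another drive.
    ALTER DATABASE testdb ADD FILEGROUP FG2;

    ALTER DATABASE testdb
    ADD FILE (
        NAME = testdb_fg2_data,
        FILENAME = 'E:\SQLData\testdb_fg2_data.ndf',  -- assumed path
        SIZE = 1GB,
        FILEGROWTH = 512MB
    ) TO FILEGROUP FG2;

    Note that moving an existing table onto the new filegroup then means rebuilding it there, e.g. by creating a clustered index ON FG2.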

    Additionally, DELETEs are fully logged no matter which Recovery Model you may be using. Sure, the SIMPLE recovery model clears the information from the Transaction Log File once the transaction is successfully committed, but it's still going to log whatever batch size you use, and that also means two writes for each row... one to the table and one to the Transaction Log File.

    My recommendation would be to study up on how to use "Minimal Logging".
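
    As a sketch of what that can look like (assuming the database uses the SIMPLE or BULK_LOGGED recovery model and the destination is an empty heap), an INSERT with a TABLOCK hint can qualify for minimal logging:

    -- Minimal logging for INSERT...SELECT into a heap requires the
    -- SIMPLE or BULK_LOGGED recovery model plus a TABLOCK hint on
    -- the target; only page allocations are logged, not the rows.
    INSERT INTO destination_tbl WITH (TABLOCK)
    SELECT c1, c2
    FROM source_tbl;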

    You also stated that there's no PK on the source but you still want to eliminate copying duplicates. How do you define what a "duplicate" actually is? The whole row only, or is there a column (or columns) that can be used to identify duplicates?

    Also, the method you posted is almost guaranteed to make a silent error because you simply cannot count on the supposed "natural" order in a table. You MUST have something unique you can sort on to guarantee the sort order of both the insert and the delete that you want to do.
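
    For instance (a sketch only, not part of the OP's existing schema), adding an IDENTITY column gives the heap a unique, sortable key to batch on... though note that populating it across 100 million rows is itself a fully logged, size-of-data operation:

    -- Hypothetical: give the heap a unique, ordered key to batch on.
    -- Populating the new column rewrites every row, so it needs log space.
    ALTER TABLE source_tbl ADD rid BIGINT IDENTITY(1,1) NOT NULL;

    -- Batches can then be addressed deterministically, e.g.
    -- (@lo and @hi are hypothetical batch-boundary variables):
    -- DELETE FROM source_tbl WHERE rid BETWEEN @lo AND @hi;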

    Let us know how you intend to determine what a duplicate actually is.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If the two tables are exactly the same, couldn't you just drop the empty table and rename the remaining table? Of course, that wouldn't get rid of your duplicates.

    Another option would be to create a SYNONYM rather than moving the rows.

    If you have to move the rows, I would first get rid of the duplicates and then use the OUTPUT clause to DELETE from one table and INSERT into another. Something like the following:

    DECLARE @batch_size INT = 1000;

    DELETE TOP (@batch_size) FROM table_a
    OUTPUT DELETED.* INTO table_b;
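
    Putting that together for the tables in the original post — a sketch only, which assumes whole-row duplicates may simply travel with the data, and that each batch commits separately so the log space can be reused between batches under the SIMPLE recovery model:

    DECLARE @BatchSize INT = 10000;

    WHILE 1 = 1
    BEGIN
        -- Each iteration is its own transaction: rows are removed from
        -- the source and written to the destination in one atomic step.
        DELETE TOP (@BatchSize) FROM source_tbl
        OUTPUT DELETED.c1, DELETED.c2
        INTO destination_tbl (c1, c2);

        IF @@ROWCOUNT < @BatchSize
            BREAK;
    END;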

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA
