TSQL query help needed

  • Hi Team,

    I have 2 tables with exactly the same structure (i.e. the same columns and datatypes). The difference is that the source table has 100 million rows and the destination table has 0 rows.

    One more thing to mention: neither table has a primary key column, and the source table may also contain some duplicate data.

    So, what I am looking for is a way to copy rows from the source table to the destination table in small chunks (to avoid a large transaction, which may result in a large transaction log file), and at the same time delete those rows from the source table. Again, there is no primary key column in the source table to tell up to which point the rows have been copied over to the destination table.

    I tried to copy rows as below and wanted to truncate the source table once all rows were copied, but due to a space constraint I am getting the error below.

    I can add more disk space, but not at this point. So I am looking for T-SQL code that takes care of the INSERT into the destination table and the DELETE from the source table.

    Can anyone help me with how to put the DELETE logic into the batch below? Does ROW_NUMBER() have to be used?

    /*

    Could not allocate space for object 'destination_tbl' in database 'testdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.

    */

    go

    --My piece of code
    DECLARE @BatchSize INT = 1000;

    WHILE 1 = 1
    BEGIN
        INSERT INTO destination_tbl
        SELECT TOP (@BatchSize) * FROM source_tbl;

        IF @@ROWCOUNT < @BatchSize
            BREAK;
    END;

    -- insert 50mb worth of data
    -- page count for a table
    create table source_tbl
    (c1 int identity,
     c2 char (8000) default 'a' --- each row fills one 8k page
    );

    insert into source_tbl default values
    go 6400

    create table destination_tbl
    (c1 int,
     c2 char (8000)
    );

    Thanks,

    Sam

  • It seems you are misunderstanding what SQL Server does with empty space in a table... It does nothing unless you tell it to.

    Executing DELETE statements against a table does not make the table consume less hard drive space. SHRINK does that, but it fragments your indexes. What you probably need is another filegroup on another drive, and then you can deal with the problem properly.
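
    If a drive does become available, the sketch below shows the general shape of that fix (the filegroup name, file path, and sizes are assumptions, not values from this thread):

    -- Add a new filegroup and a file for it on another drive.
    ALTER DATABASE testdb ADD FILEGROUP FG2;

    ALTER DATABASE testdb
    ADD FILE (
        NAME = testdb_fg2_data,
        FILENAME = 'E:\SQLData\testdb_fg2_data.ndf',  -- assumed path
        SIZE = 1GB,
        FILEGROWTH = 512MB
    ) TO FILEGROUP FG2;

    Note that moving an existing table onto the new filegroup then means rebuilding it there, e.g. by creating a clustered index ON FG2.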

    Additionally, DELETEs are fully logged no matter which Recovery Model you may be using. Sure, the SIMPLE recovery model clears the information from the Transaction Log File once the transaction is successfully committed, but it's still going to log whatever batch size you use, and that also means two writes for each row... one to the table and one to the Transaction Log File.

    My recommendation would be to study up on how to use "Minimal Logging".
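
    As a sketch of what that can look like (assuming the database uses the SIMPLE or BULK_LOGGED recovery model and the destination is an empty heap), an INSERT with a TABLOCK hint can qualify for minimal logging:

    -- Minimal logging for INSERT...SELECT into a heap requires the
    -- SIMPLE or BULK_LOGGED recovery model plus a TABLOCK hint on
    -- the target; only page allocations are logged, not the rows.
    INSERT INTO destination_tbl WITH (TABLOCK)
    SELECT c1, c2
    FROM source_tbl;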

    You also stated that there's no PK on the source but you still want to eliminate copying duplicates. How do you define what a "duplicate" actually is? The whole row only, or is there a column (or columns) that can be used to identify duplicates?

    Also, the method you posted is almost guaranteed to make a silent error because you simply cannot count on the supposed "natural" order in a table. You MUST have something unique you can sort on to guarantee the sort order of both the insert and the delete that you want to do.
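
    For instance (a sketch only, not part of the OP's existing schema), adding an IDENTITY column gives the heap a unique, sortable key to batch on... though note that populating it across 100 million rows is itself a fully logged, size-of-data operation:

    -- Hypothetical: give the heap a unique, ordered key to batch on.
    -- Populating the new column rewrites every row, so it needs log space.
    ALTER TABLE source_tbl ADD rid BIGINT IDENTITY(1,1) NOT NULL;

    -- Batches can then be addressed deterministically, e.g.
    -- (@lo and @hi are hypothetical batch-boundary variables):
    -- DELETE FROM source_tbl WHERE rid BETWEEN @lo AND @hi;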

    Let us know how you intend to determine what a duplicate actually is.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
        Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • If the two tables are exactly the same, couldn't you just drop the empty table and rename the remaining table? Of course, that wouldn't get rid of your duplicates.

    Another option would be to create a SYNONYM rather than moving the rows.

    If you have to move the rows, I would first get rid of the duplicates and then use the OUTPUT clause to DELETE from one table and INSERT into another. Something like the following:

    DECLARE @batch_size INT = 1000;

    DELETE TOP (@batch_size) FROM table_a
    OUTPUT DELETED.* INTO table_b;
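
    Putting that together for the tables in the original post — a sketch only, which assumes whole-row duplicates may simply travel with the data, and that each batch commits separately so the log space can be reused between batches under the SIMPLE recovery model:

    DECLARE @BatchSize INT = 10000;

    WHILE 1 = 1
    BEGIN
        -- Each iteration is its own transaction: rows are removed from
        -- the source and written to the destination in one atomic step.
        DELETE TOP (@BatchSize) FROM source_tbl
        OUTPUT DELETED.c1, DELETED.c2
        INTO destination_tbl (c1, c2);

        IF @@ROWCOUNT < @BatchSize
            BREAK;
    END;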

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA
