insert in batch question

Can I do a batch insert or update with a set of rows at a time (in no particular order), and have it automatically move on to the next set of rows that have not yet been inserted or updated? If so, how can I do it?

Thanks for any help.

Below is my script. The insert is for several million records, so I break it down by filenumber, but each insert is still big, so I am trying to do it in smaller sets.

insert into table2 (col1, col2, col3)
select table1.col1, table1.col2, table3.col3
from table1 inner join table3
on table1.col1 = table3.col1
where table1.filenumber = 1
go

insert into table2 (col1, col2, col3)
select table1.col1, table1.col2, table3.col3
from table1 inner join table3
on table1.col1 = table3.col1
where table1.filenumber = 2
go

  • sqlblue (1/5/2009)


Can I do a batch insert or update with a set of rows at a time, and have it automatically move on to the next set of rows that have not yet been inserted or updated? If so, how can I do it? ...

Yes, you can do batched inserts like that... just make sure you have enough log space.

If table2 (the destination) is empty, you may want to do it all at once using SELECT ... INTO, which will be minimally logged.
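For instance, a minimal sketch of the SELECT ... INTO form, assuming table2 does not already exist (SELECT ... INTO creates the destination table, and is minimally logged only under the SIMPLE or BULK_LOGGED recovery model):

select table1.col1, table1.col2, table3.col3
into table2
from table1 inner join table3
on table1.col1 = table3.col1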


    * Noel

Thanks Noel, but I wanted to be able to do just a set of rows, say 10,000 or 100,000 at a time. How do I do it so that the query will continue on to the next 10K or 100K records, without using a cursor? If I use SET ROWCOUNT or TOP 10,000, it would only run once, for just that set of records, right?

  • sqlblue (1/5/2009)


Thanks Noel, but I wanted to be able to do just a set of rows, say 10,000 or 100,000 at a time. How do I do it so that the query will continue on to the next 10K or 100K records, without using a cursor? ...

If all you are doing is copying, and not deleting from the source, you will need a marker on the destination table... and a WHILE loop:

declare @x int
set @x = @x -- no-op assignment so that @@rowcount = 1 and the loop body runs at least once

while @@rowcount > 0
begin

    insert into dest_table (...)
    select top (100000) ...
    from src_query
    where src_query.pk > (select isnull(max(pk), 0) from dest_table) -- the marker: only rows not yet copied
    order by src_query.pk

end

... Keep in mind that you will have to take transaction log backups regularly while this thing goes on...
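For example (the database name and backup path here are purely illustrative):

backup log MyDatabase to disk = N'D:\Backups\MyDatabase_log.trn'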


    * Noel

You could add an identity column if there isn't one already.

ALTER TABLE table1 ADD id_temp INT IDENTITY(1,1) NOT NULL

-- This adds an identity column (id_temp) temporarily. You can remove the column later, after your inserts complete.

DECLARE @batch_size INT,
        @row_counter INT,
        @maxrow_count INT;

SET @batch_size = 5000; -- batch of 5,000 rows
SET @row_counter = 1;
SELECT @maxrow_count = MAX(id_temp) FROM table1;

WHILE @row_counter <= @maxrow_count
BEGIN
    INSERT INTO table2 (col1)
    SELECT col1
    FROM table1
    WHERE id_temp BETWEEN @row_counter AND (@row_counter + @batch_size - 1);

    -- Set @row_counter to the start of the next batch
    SET @row_counter = @row_counter + @batch_size;
END

    You can now drop the identity column (id_temp) from table1.

I see, I will try both of your methods and see if they work for me. But for Noel's suggestion, what is @x for in your query? Thanks a lot to both of you for helping.

  • sqlblue (1/5/2009)


Thanks Noel, but I wanted to be able to do just a set of rows, say 10,000 or 100,000 at a time. How do I do it so that the query will continue on to the next 10K or 100K records, without using a cursor? ...

    It might be worth your while using a separate batch control or "tracker" table:

SELECT IDENTITY(INT, 1, 1) AS RowID, Table1PK
INTO #BatchControl
FROM table1
INNER JOIN table3
    ON table1.col1 = table3.col1

DECLARE @Batchsize INT
SET @Batchsize = 10000

WHILE EXISTS (SELECT 1 FROM #BatchControl) -- while there are rows left
BEGIN

    INSERT INTO table2 (col1, col2, col3)
    SELECT table1.col1, table1.col2, table3.col3
    FROM table1
    INNER JOIN table3
        ON table1.col1 = table3.col1
    INNER JOIN (SELECT TOP (@Batchsize) Table1PK FROM #BatchControl ORDER BY RowID) b
        ON b.Table1PK = table1.Table1PK

    -- Remove the batch just copied from the tracker table
    DELETE bc
    FROM #BatchControl bc
    INNER JOIN (SELECT TOP (@Batchsize) RowID FROM #BatchControl ORDER BY RowID) b
        ON b.RowID = bc.RowID

END

"Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms." - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • I would use something similar to Chris' SQL - but I would create #BatchControl manually with a clustered index on the RowID, then use a counting loop similar to gyesql's solution. I'd get the maximum using IDENT_CURRENT though, and not SELECT MAX().
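Something along these lines, perhaps (a sketch only; the table and column names carry over from the earlier posts, and the tempdb qualification for IDENT_CURRENT on a temp table is an assumption worth verifying):

CREATE TABLE #BatchControl
(
    RowID INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
    Table1PK INT NOT NULL
)

INSERT INTO #BatchControl (Table1PK)
SELECT table1.Table1PK
FROM table1
INNER JOIN table3
    ON table1.col1 = table3.col1

DECLARE @Batchsize INT, @row_counter INT, @maxrow_count INT
SET @Batchsize = 10000
SET @row_counter = 1

-- IDENT_CURRENT reads the last identity value generated, so it avoids
-- the table scan that SELECT MAX(RowID) would do
SET @maxrow_count = IDENT_CURRENT('tempdb..#BatchControl')

WHILE @row_counter <= @maxrow_count
BEGIN
    INSERT INTO table2 (col1, col2, col3)
    SELECT table1.col1, table1.col2, table3.col3
    FROM table1
    INNER JOIN table3
        ON table1.col1 = table3.col1
    INNER JOIN #BatchControl b
        ON b.Table1PK = table1.Table1PK
    WHERE b.RowID BETWEEN @row_counter AND @row_counter + @Batchsize - 1

    SET @row_counter = @row_counter + @Batchsize
END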

    Atlantis Interactive - SQL Server Tools
My blog
    Why I wrote a sql query analyzer clone

  • sqlblue (1/5/2009)


Can I do a batch insert or update with a set of rows at a time, and have it automatically move on to the next set of rows that have not yet been inserted or updated? If so, how can I do it? ...

Below is my script. The insert is for several million records, so I break it down by filenumber, but each insert is still big, so I am trying to do it in smaller sets. ...

    "Break it down by file number..." So, why aren't you doing it that way? ๐Ÿ˜‰ Do a distinct on the filenumber and create a control table that has the file numbers in it... then, loop on that. The handy part is, any dates that are created on the way will all have the same date throughout the file.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
