SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Using INSERT INTO…SELECT to Bulk Load Data with Minimal Logging


Using INSERT INTO…SELECT to Bulk Load Data with Minimal Logging

Author
Message
Dave Mason
Dave Mason
Mr or Mrs. 500
Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)

Group: General Forum Members
Points: 538 Visits: 901
Please accept my apology if this post is in the wrong forum.

I'm trying to bulk load two tables with mimimal logging. I started by setting the recovery model to Bulk-logged. Here's the pseudo-code for each table:

TRUNCATE TABLE table_name

INSERT INTO table_name WITH(TABLOCK)
(Column_Names)
SELECT
(Column_Names)
FROM OPENROWSET (
BULK 'data_file_path',
FORMATFILE = 'format_file_path') alias
WHERE condition
ORDER BY column_with_nonclustered_index



I can backup the transaction log and shrink it down to 50 MB. After running the code for the two tables, the transaction log has grown to about 560 MB, which is about the same size as the primary data file (.mdf file).

Each table has exactly one index, which is non-clustered. (There are no primary keys, and no unique constraints.) I tried dropping the index before the insert statement, and then recreating the index after the insert statement as follows:

TRUNCATE TABLE table_name
DROP nonclustered_index

INSERT INTO table_name WITH(TABLOCK)
(Column_Names)
SELECT
(Column_Names)
FROM OPENROWSET (
BULK 'data_file_path',
FORMATFILE = 'format_file_path') alias
WHERE condition
ORDER BY column_with_nonclustered_index

CREATE nonclustered_index



Once again, I backup the transaction log and shrink it down to 50 MB. After running the code for the two tables, the transaction log stays at 50 MB.

In my mind, the 2nd scenario achieved my goal of bulk loading the two tables with minimal logging. But I don't know why I had to drop/create the indexes. I'd prefer not to do this. This documentation got me headed in the right direction:

http://msdn.microsoft.com/en-us/library/ms190422.aspx

But the question remains: If the tables are truncated before the bulk load, why do I have to drop/create indexes? (I am using SQL 2005, and the documentation is for SQL 2008.)

What am I missing?

Dave MasonSeminole County, FL
Dave Mason
Dave Mason
Mr or Mrs. 500
Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)Mr or Mrs. 500 (538 reputation)

Group: General Forum Members
Points: 538 Visits: 901
Followup...

I'm really reluctant to drop and recreate indexes. Let's say another dba or developer changes the index (add/remove columns to/from the index, change the FILLFACTOR, etc). My code would nullify those changes because I hard-coded the steps to create the index. Not good!

A better option is to disable the index and rebuild it later. (Thanks to Andy Warren for his article about Disabling Indexes!)

Here's my revised pseudo-code:

TRUNCATE TABLE table_name
ALTER INDEX nonclustered_index ON table_name DISABLE

INSERT INTO table_name WITH(TABLOCK)
(Column_Names)
SELECT
(Column_Names)
FROM OPENROWSET (
BULK 'data_file_path',
FORMATFILE = 'format_file_path') alias
WHERE condition
ORDER BY column_with_nonclustered_index

ALTER INDEX nonclustered_index ON table_name REBUILD



If I get really motivated, I can programatically determine the non-clustered indexes and disable/rebuild them.

Still, after reading the documentation on MSDN, I'm confused as to why any of this matters when the tables are truncated first...

Dave MasonSeminole County, FL
UMG Developer
UMG Developer
SSCrazy
SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)SSCrazy (2.7K reputation)

Group: General Forum Members
Points: 2714 Visits: 2204
Dave Mason (5/26/2010)

Still, after reading the documentation on MSDN, I'm confused as to why any of this matters when the tables are truncated first...


It's been a while, but in case you haven't figured it out, the reason that the log files stay so much smaller when you either DROP or DISABLE the non-clustered indexes is that if the indexes are active while you are loading data into the table SQL Server has to log all of the updates to the indexes. So as a record is inserted it has to log the page changes, and page splits. When you DROP or DISABLE the index first, and then CREATE or REBUILD it after the data is in, you are essentially creating a brand new index.

The two main good side-effects of doing this are that:
* the fragmentation in your indexes will be much less, making for better performance.
* It is normally faster to insert the data and then index. (It is faster to create the index all at once than to modify it row by row to keep it up to date.)

I hope that helps explain things.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search