big table insert advice
Posted Sunday, December 23, 2012 3:10 AM
Forum Newbie

Hi, I need to refresh the data in a table from another table. About 14 million records need to be inserted.
(I was planning to use the Import Data task; there are no dependencies except for a view.)

The destination table (identical to the source table) has a primary key with a clustered index, and also a few non-clustered indexes.
Do I drop the indexes and the primary key, insert the data, and then add them back?

Or do I just copy the data directly without dropping the indexes? That seems like it would hurt performance, but please let me know.
Post #1399745
Posted Sunday, December 23, 2012 8:41 AM


SSCrazy

sqlsurfer11 (12/23/2012)
Do I drop the indexes and the primary key and then insert data and add them back?

I'm not in favour of disabling the indexes (both clustered and non-clustered). As you perform the insert, the data will be sorted per the clustered index and the non-clustered indexes will be maintained accordingly. But one thing you do need to ensure is some mechanism for the load, such as a "batch insert approach"; that way you manage the strain on resources, particularly the t-log.
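A batch-insert loop along those lines might be sketched as follows (the table names, column names, key, and batch size are all hypothetical; it assumes the destination has the same integer primary key as the source):

```sql
-- Hypothetical sketch: copy dbo.SourceTable into dbo.DestTable in
-- batches of 100,000 rows so each transaction (and the t-log growth
-- between log backups/checkpoints) stays small.
DECLARE @BatchSize INT = 100000;

WHILE 1 = 1
BEGIN
    INSERT INTO dbo.DestTable (Id, Col1, Col2)
    SELECT TOP (@BatchSize) s.Id, s.Col1, s.Col2
    FROM dbo.SourceTable AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.DestTable AS d WHERE d.Id = s.Id)
    ORDER BY s.Id;

    IF @@ROWCOUNT = 0 BREAK;  -- nothing left to copy
END;
```

Committing per batch rather than in one giant transaction is what keeps the transaction log from ballooning during the 14-million-row load.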


-------Bhuvnesh----------
I work only to learn Sql Server...though my company pays me for getting their stuff done
Post #1399755
Posted Sunday, December 23, 2012 10:53 AM


SSCommitted

sqlsurfer11 (12/23/2012)
Hi, I need to refresh the data in a table from another table. About 14 million records need to be inserted.
(I was planning to use the Import Data task; there are no dependencies except for a view.)

The destination table (identical to the source table) has a primary key with a clustered index, and also a few non-clustered indexes.
Do I drop the indexes and the primary key, insert the data, and then add them back?

Or do I just copy the data directly without dropping the indexes? That seems like it would hurt performance, but please let me know.

When bulk inserting into a large table, one potential performance hit is index fragmentation, in both the target table's clustered index and perhaps also the non-clustered indexes. It depends on whether the clustered key in the target table is sequentially incremented, like a date/time or an identity column. If the keys are inserted sequentially, fragmentation may be insignificant. Run a query to check index fragmentation before and after an insert; if the indexes are getting heavily fragmented, then it might make sense to drop and re-create them. It also depends on how often this table gets bulk inserted into.
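That before-and-after fragmentation check could look like this (the table name is a placeholder for your target table):

```sql
-- Average fragmentation of every index on the target table.
-- 'LIMITED' scans only the b-tree parent pages, so it is cheap enough
-- to run both before and after the bulk insert.
SELECT i.name AS IndexName,
       ips.index_type_desc,
       ips.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.DestTable'), NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id  = ips.index_id;
```

If avg_fragmentation_in_percent jumps sharply after the load, that is the signal to consider dropping and re-creating (or rebuilding) the indexes.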
Post #1399757
Posted Monday, December 24, 2012 3:18 AM
SSC Veteran

Hi,
a while ago I had to create an environment copy of a large database at a client site.

I tried several methods of copying the large tables, but the one that gave the fastest copy was to drop the table and indexes, SELECT INTO, then add back all the indexes and stats.

A couple of caveats:
1) this was a copy to development!!!
2) the recovery model was SIMPLE
3) I generated a script to run through the indexes and recreate them
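A minimal sketch of that drop-and-SELECT-INTO approach (table, key, and index names here are hypothetical; a real version would first script out every existing index and constraint, as per caveat 3):

```sql
-- 1) Drop the old copy (its indexes go with it)
IF OBJECT_ID('dbo.DestTable', 'U') IS NOT NULL
    DROP TABLE dbo.DestTable;

-- 2) SELECT INTO: minimally logged under the SIMPLE (or BULK_LOGGED)
--    recovery model
SELECT *
INTO dbo.DestTable
FROM dbo.SourceTable;

-- 3) Re-create the primary key / clustered index first, then the
--    non-clustered indexes (assumes Id is a NOT NULL column)
ALTER TABLE dbo.DestTable
    ADD CONSTRAINT PK_DestTable PRIMARY KEY CLUSTERED (Id);
CREATE NONCLUSTERED INDEX IX_DestTable_Col1 ON dbo.DestTable (Col1);
```

Building the clustered index before the non-clustered ones avoids rebuilding the non-clustered indexes twice, since they reference the clustering key.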

I hope this gives you food for thought.

Cheers

Peter


I work for 1st Consulting, a small ISV based in the UK. We build web and desktop applications using the MSFT stack. Find us at www.1stconsulting.biz
Post #1399855
Posted Tuesday, December 25, 2012 7:47 AM


SSCommitted

Peter Bannister (12/24/2012)
...the one that gave the fastest copy was to drop the table and indexes, SELECT INTO, then add back all the indexes and stats.

The reason SELECT INTO <new table> is faster than INSERT INTO <existing table> is that SELECT INTO doesn't write the inserted data pages to the transaction log. In the event of a rollback, SQL Server simply has to discard the table. When inserting into an existing table, however, SQL Server has to keep track of which pages are being modified in order to support a rollback.
Post #1400053
Posted Tuesday, December 25, 2012 11:07 PM


SSCrazy

Eric M Russell (12/25/2012)
The reason SELECT INTO <new table> is faster than INSERT INTO <existing table> is that SELECT INTO doesn't write the inserted data pages to the transaction log. In the event of a rollback, SQL Server simply has to discard the table.

Any article/reference for this?


Post #1400147
Posted Wednesday, December 26, 2012 12:13 AM


Ten Centuries

There are two ways (apart from BULK INSERT) that generate minimal transaction logging:

1) SELECT INTO
2) INSERT using the TABLOCK hint
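For an existing table, the TABLOCK hint can qualify the insert for minimal logging (table and column names here are placeholders; this also requires the SIMPLE or BULK_LOGGED recovery model and the usual minimal-logging preconditions, such as an empty or heap target):

```sql
-- TABLOCK takes a table-level lock, which is what allows SQL Server
-- to use the minimally logged insert path when the preconditions hold.
INSERT INTO dbo.DestTable WITH (TABLOCK) (Id, Col1, Col2)
SELECT Id, Col1, Col2
FROM dbo.SourceTable;
```

The trade-off is concurrency: the table lock blocks other writers for the duration of the insert, which is usually acceptable for a maintenance-window refresh like this one.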


~ Lokesh Vij

Guidelines for quicker answers on T-SQL question
Guidelines for answers on Performance questions

Link to my Blog Post --> www.SQLPathy.com

Follow me @Twitter

Post #1400158
Posted Wednesday, December 26, 2012 12:14 AM


Ten Centuries

Bhuvnesh (12/25/2012)
Any article/reference for this?

Here is the link, Bhuvnesh:
http://msdn.microsoft.com/en-us/library/ms191244(v=sql.105).aspx



Post #1400159
Posted Wednesday, December 26, 2012 12:15 AM


Ten Centuries

Eric M Russell (12/25/2012)
The reason SELECT INTO <new table> is faster than INSERT INTO <existing table> is that SELECT INTO doesn't write the inserted data pages to the transaction log...

Just a small correction: there is still minimal logging being done, so it is not correct to say that no transaction log is generated. You can use the query below to inspect the log records that are created (substitute your own table name):

SELECT TOP 10 *
FROM fn_dblog(NULL, NULL)
WHERE AllocUnitName LIKE '%<Your_table_name>%'
ORDER BY [Log Record Length] DESC;




Post #1400161