Bulk copy from one database to another

  • Experienced developer but pretty new to SQL Server...

    About 8 times a year (roughly twice a quarter), I need to copy a subset of tables/columns from one database to another. This is due to data access security requirements over which I have no control (I can't turn on cross-database access). The copy involves roughly 1 billion records across 10 tables.

    Most of the source tables have no relationships (no FKs); four have FKs to another table.

    I've created indexes on the target tables to support the queries required by the application.

    My first attempt (TRUNCATE TABLE, then INSERT INTO ... SELECT) filled the transaction log. After a bit of Googling, I've written a PowerShell script that uses the .NET System.Data.SqlClient.SqlBulkCopy class to copy the data from the source to the target database.

    When I tested the performance of this script copying to target heap tables, the results were great. Now that I've added the indexes, not so good.

    Any advice for improving the performance? Should I disable the FKs, drop the indexes, truncate the tables, run the script, recreate the indexes, and re-enable the FKs? Is the *overall* processing time improved by first dropping the indexes, vs. just truncating the tables and letting the existing indexes be maintained while the bulk copy runs?
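
    (By "disable the FKs" I mean something along these lines; the table name is made up, and I'd re-enable WITH CHECK afterwards so the constraints come back as trusted:)

    -- Hypothetical child table name; repeat for each of the four tables with FKs
    ALTER TABLE content.SomeChildTable NOCHECK CONSTRAINT ALL;

    -- ... run the bulk load here ...

    -- WITH CHECK revalidates the data so the constraints are trusted again
    ALTER TABLE content.SomeChildTable WITH CHECK CHECK CONSTRAINT ALL;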

    From my Googling, it doesn't appear that I can programmatically capture the current indexes and recreate them later. A pity, but if not, the index definitions can easily be scripted by hand.

    I just tested, and it appears I have the rights to change the target database's recovery model.
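
    (If changing the recovery model is a legitimate option here, the rough idea I have in mind is the sketch below; the database name and backup path are placeholders, and it assumes the target database normally runs in FULL recovery:)

    ALTER DATABASE TgtDatabase SET RECOVERY BULK_LOGGED;

    -- ... run the bulk copy ...

    ALTER DATABASE TgtDatabase SET RECOVERY FULL;
    -- Take a log backup afterwards so the log chain stays usable for point-in-time restores
    BACKUP LOG TgtDatabase TO DISK = N'X:\Backups\TgtDatabase_log.trn';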

    Any and all advice appreciated.

  • Looks like this should do it...

    http://stackoverflow.com/questions/11301383/automatically-drop-and-recreate-current-indexes

    EXEC sp_MSforEachTable 'ALTER INDEX ALL ON ? DISABLE'

    and

    EXEC sp_MSforEachTable 'ALTER INDEX ALL ON ? REBUILD'

    is all you need if you want to do it for all tables and every index.

  • pietlinden (9/4/2016)


    Looks like this should do it...

    http://stackoverflow.com/questions/11301383/automatically-drop-and-recreate-current-indexes

    EXEC sp_MSforEachTable 'ALTER INDEX ALL ON ? DISABLE'

    and

    EXEC sp_MSforEachTable 'ALTER INDEX ALL ON ? REBUILD'

    is all you need if you want to do it for all tables and every index.

    This looked promising, but when I do this I get an error:

    ALTER INDEX ALL ON content.ISC_EPISODE_DRG DISABLE

    INSERT INTO content.ISC_EPISODE_DRG
    (
        [facility_identifier]
        ,[stay_number_e]
        ,[ar_drg]
        ,[ar_drg_version]
        ,[mdc]
        ,[snap_from_date]
        ,[replica_valid_from_date]
        ,[replica_valid_to_date]
        ,[replica_current_record]
    )
    SELECT [facility_identifier]
        ,HASHBYTES('SHA1', [stay_number]) AS stay_number_e
        ,[ar_drg]
        ,[ar_drg_version]
        ,[mdc]
        ,[snap_from_date]
        ,[replica_valid_from_date]
        ,[replica_valid_to_date]
        ,[replica_current_record]
    FROM hierep_prod.dbo.ISC_EPISODE_DRG

    Returns:

    Msg 8655, Level 16, State 1, Line 3

    The query processor is unable to produce a plan because the index 'IXCN_ISC_EPISODE_DRG_NaturalKey' on table or view 'ISC_EPISODE_DRG' is disabled.

    The index listed is the clustered index.
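
    (One workaround I'm experimenting with is to disable only the nonclustered indexes and leave the clustered one alone; a rough, untested sketch using the standard metadata views:)

    -- Build ALTER INDEX ... DISABLE statements for nonclustered indexes only,
    -- leaving the clustered index enabled so inserts still work
    DECLARE @sql nvarchar(max) = N'';

    SELECT @sql += N'ALTER INDEX ' + QUOTENAME(i.name) + N' ON '
                 + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) + N' DISABLE;' + CHAR(13)
    FROM sys.indexes AS i
    JOIN sys.tables  AS t ON t.object_id = i.object_id
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE i.type = 2            -- nonclustered rowstore only
      AND i.is_disabled = 0;

    PRINT @sql;                 -- review, then run with: EXEC sys.sp_executesql @sql;

    After the load, the ALTER INDEX ALL ... REBUILD from the earlier post would bring them back.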

  • I've searched some more, and now realize that I can't disable the clustered index and have bulk inserts work.

    What I really need is a way to drop all indexes (convert the table to a heap), then recreate the indexes after the bulk inserts.

    Does anyone have a script or SP that does this? This looks close, but isn't quite what I need:

    https://www.mssqltips.com/sqlservertip/3441/script-out-all-sql-server-indexes-in-a-database-using-tsql/.

    Or should I just right-click the database --> Tasks --> Generate Scripts, choose the tables I want, then break the results into drop index/create index scripts?
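
    (To make the drop side concrete, the sort of thing I'm after is this rough sketch, which generates DROP INDEX statements from sys.indexes; the table name is just the example from my earlier post, and PK/unique constraints would still need ALTER TABLE ... DROP CONSTRAINT handled separately:)

    DECLARE @sql nvarchar(max) = N'';

    SELECT @sql += N'DROP INDEX ' + QUOTENAME(i.name) + N' ON '
                 + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) + N';' + CHAR(13)
    FROM sys.indexes AS i
    JOIN sys.tables  AS t ON t.object_id = i.object_id
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE i.type IN (1, 2)               -- clustered and nonclustered rowstore indexes
      AND i.is_primary_key = 0           -- constraints can't be dropped with DROP INDEX
      AND i.is_unique_constraint = 0
      AND t.name IN (N'ISC_EPISODE_DRG');  -- limit to the tables being reloaded

    PRINT @sql;                 -- review, then EXEC sys.sp_executesql @sql;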

    Thanks...

  • Scott In Sydney (9/7/2016)


    I've searched some more, and now realize that I can't disable the clustered index and have bulk inserts work.

    What I really need is a way to drop all indexes (convert the table to a heap), then recreate the indexes after the bulk inserts.

    Does anyone have a script or SP that does this? This looks close, but isn't quite what I need:

    https://www.mssqltips.com/sqlservertip/3441/script-out-all-sql-server-indexes-in-a-database-using-tsql/.

    Or should I just right-click the database --> Tasks --> Generate Scripts, choose the tables I want, then break the results into drop index/create index scripts?

    Thanks...

    Why not just bulk load into a staging table, index it as needed, then drop the main table and rename the staging table?

    I have used SMO with .NET and SSIS a while back to automate scripting of selected database objects.

    I think you can use SMO and PowerShell to achieve the same thing.

    ----------------------------------------------------------------------------------------------------------------------------------------------------
    Roshan Joe

    Jeff Moden - Forum Etiquette: How to post data/code on a forum to get the best help

  • Smendle (9/7/2016)


    Why not just bulk load into a staging table, index it as needed, then drop the main table and rename the staging table?

    I've written a PowerShell script which does the following (among other things):

    1) Clones the source table as:

    SELECT <column list> INTO TgtDatabase.schema.zzz_<TgtTable> FROM SrcDatabase.schema.table WHERE 0=1

    2) Bulk copies the above query from source to zzz_<TgtTable>.

    3) Sets PK and Indexes on zzz_<TgtTable> (separate script to the bulk copy)

    4) Drops the target table and renames the staging table as:

    exec sp_rename 'TgtDatabase.schema.zzz_<TgtTable>', '<TgtTable>'

    Problems/Questions:

    A) Is there a way I can programmatically capture the PK, FK, Indexes, etc. from the target table, and then programmatically recreate those items on the zzz_ table after the bulk copy? Right now all that post-processing in step 3 is hardcoded. And if so, what's the best approach: a stored procedure, pulling the metadata into PowerShell, or something else?
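
    (The direction I've been experimenting with is reading sys.indexes / sys.index_columns and building the CREATE INDEX text, roughly like the sketch below, with the table name as an example. It only handles simple rowstore keys and ignores included columns, filters, partitioning and the PK/FK constraints, which is why I'm asking whether there's a better approach:)

    -- Emit CREATE INDEX statements for one table, retargeted at its zzz_ staging copy
    SELECT N'CREATE '
         + CASE WHEN i.is_unique = 1 THEN N'UNIQUE ' ELSE N'' END
         + CASE WHEN i.type = 1 THEN N'CLUSTERED ' ELSE N'NONCLUSTERED ' END
         + N'INDEX ' + QUOTENAME(i.name)
         + N' ON ' + QUOTENAME(s.name) + N'.' + QUOTENAME(N'zzz_' + t.name) + N' ('
         + STUFF((SELECT N', ' + QUOTENAME(c.name)
                         + CASE WHEN ic.is_descending_key = 1 THEN N' DESC' ELSE N'' END
                  FROM sys.index_columns AS ic
                  JOIN sys.columns AS c
                    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
                  WHERE ic.object_id = i.object_id
                    AND ic.index_id  = i.index_id
                    AND ic.is_included_column = 0
                  ORDER BY ic.key_ordinal
                  FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 2, N'')
         + N');' AS create_index_stmt
    FROM sys.indexes AS i
    JOIN sys.tables  AS t ON t.object_id = i.object_id
    JOIN sys.schemas AS s ON s.schema_id = t.schema_id
    WHERE t.name = N'ISC_EPISODE_DRG'    -- example: existing target table to copy definitions from
      AND i.type IN (1, 2)               -- clustered + nonclustered rowstore
      AND i.is_primary_key = 0
      AND i.is_unique_constraint = 0;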

    B) I'm creating the indexes named as IXNN_TABLENAME_ColumnName. However, this results in a name collision with the existing target table. I prefer not to delete the target table until step 4 so the users can continue to work during the bulk copy. I could create the indexes as IXNN_zzz_TABLENAME_ColumnName. But is there a way I can programmatically rename the PK, FK, Indexes, etc. when I rename the table?
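
    (I did find that sp_rename accepts an 'INDEX' object type, so presumably the index rename could be scripted along these lines, using the naming convention above as placeholders; I haven't confirmed how well the same idea covers the PK and FK constraint names:)

    EXEC sp_rename
         @objname = N'schema.zzz_TABLENAME.IXNN_zzz_TABLENAME_ColumnName',
         @newname = N'IXNN_TABLENAME_ColumnName',
         @objtype = N'INDEX';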

    C) In general this processing will run overnight. However, if it is long running, it could still be running in the morning. I assume if someone is querying the target table when step 4 is executing, that step will fail. Is there a way to prevent that, say a long timeout value for the table rename step?
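
    (By "a long timeout value" I mean something like the following: wrap the swap in a transaction and bound how long it will wait on the schema lock, using one of the tables above as an example:)

    SET XACT_ABORT ON;
    SET LOCK_TIMEOUT 600000;   -- wait up to 10 minutes for the schema lock; the default is to wait indefinitely

    BEGIN TRAN;
        DROP TABLE content.ISC_EPISODE_DRG;
        EXEC sp_rename 'content.zzz_ISC_EPISODE_DRG', 'ISC_EPISODE_DRG';
    COMMIT;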

    Thanks,

    Scott

    P.S.: I'm sure the PS script could be improved. If anyone is interested in the script, either to critique or use for themselves, let me know and I'll upload to GitHub.

  • A) I've got this working. I couldn't work out how to dynamically build CREATE INDEX code from an existing table, so this is hard coded in external .sql files.

    B) I've got this working.

    C) Low priority...if you can comment then cool.

    My PS script is here: https://github.com/scottbass/Powershell/blob/master/Functions/SqlServerBulkCopy.ps1. Comments/improvements welcome.
