best way to copy active table

  • Greg. It's quite irrelevent where in the column order the primary key is. I assume that your primary key is clustered (since it's the default)

    The data will be stored on disk in the order of the cluster key, regadless of what position that column has in the table.

    There's no need to reorder the columns (and no real way to do it. Enterprise manager completely recreates the table when you reorder columns)

    The order of the columns doesn't matter to SQL and it shouldn't matter to users (ie don't use select *)

    On disk, there's no guarentee that the data is stored within each row in the same order as the columns are defined. That's one of the underlying principles of relational databases - the user of the data (or programmer) does not have to know anything about the underlying physical structure.

    If you are interested, look at the topic "Table and Index Architecture" under SQL Server Architecture in Books Online

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • If I'm going to do an insert across linked servers, it looks like this

    Insert into LocalDB.dbo.tbl (<fieldlist&gt

    SELECT <fieldlist> FROM RemoteServer.RemoteDB.dbo.OtherTbl

    Be very careful of select... into especially with large volumes as some of the system tables in the destination db will be locked for the duration of the insert.

    Run these 2 and compare the locks that are held in db 2 (The transactions are only there so that the locks get held long enough to be visible with sp_lock. There's a transaction around every command because that is how things would be in normal autocommit mode)

    1) On my machine takes 55 locks, 50 in TempDB

    BEGIN TRANSACTION

     SELECT 1 AS a, 'abc' as b INTO #Test

     exec sp_lock

    COMMIT TRANSACTION

    DROP TABLE #Test

    2) On my machine create table takes 16 locks, 11 of them in TempDB. Insert takes 5 locks, one of them in tempDB

    BEGIN TRANSACTION

     CREATE TABLE #Test (a tinyint, b char(3))

     exec sp_lock

    COMMIT TRANSACTION

    BEGIN TRANSACTION

     INSERT INTO #Test

     SELECT 1 AS a, 'abc' as b

     exec sp_lock

    COMMIT TRANSACTION

    DROP TABLE #Test

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It's looking like my best bet in this case is to re-create the table on the destination server WITH a PK identity(1,1) column and do an INSERT INTO .. SELECT .. WITH (NOLOCK). It's faster than the SELECT .. INTO and much faster than DTS (at least, the way I had it configured.. I tried a number of options).

    Results:

    INSERT INTO SELECT ..: 09 seconds

    SELECT ... INTO, then ADD PK id col: 27 seconds

    DTS: around 10 minutes

  • Like I said earlier... the hard knocks code will beat the pants off DTS either way.  Thanks for the feedback Greg.  You can get some extra performance out of SELECT INTO if you set the SELECT INTO/BULKCOPY option to ON and meet a few other requirements that are listed in BOL.  Still, the addition of a PK will take a good amount of time especially if you are adding a clustered index.  You have prompted me to do some extreme load testing within and across servers to compare INSERT/SELECT with a PK (both clustered and nonclustered) as compared to SELECT INTO (in and out of the bulk copy mode) and then adding each type of primary key.

    Gail, thanks for the test code.  I did see the locks on SysObjects, SysIndexes, and SysColumns but those locks just don't make a difference and will not interfere with the operation of the server... if the did, you wouldn't be able to create another temp table in another window of query analyzer... try this...

    Run this from one window in QA...

    BEGIN TRANSACTION

     SELECT 1 AS a, 'abc' as b INTO #Test

     exec sp_lock

    Since there is no commit, it will hold the locks on the system tables forever if we keep the window open.  So keep the window open for the rest of this test... we want to run nearly identical code in another QA window... if the locks matter, the following code should not run in a second window until we commit in the first window (do note that the table name has been changed a tiny bit)...

    BEGIN TRANSACTION

     SELECT 1 AS a, 'abc' as b INTO #Test1

     exec sp_lock

    When we execute the above code in a second window, the code runs successfully despite the locks being held by the first query in the first window.  The reason why is that the optimizer is smart enough to lock different extents in the systems tables for each temp table (or real table if you want) so although I agree the locks exist, you don't have to be careful with using SELECT...INTO because the system is mighty smart about how it does it.

    For copying to a new permanent table, the locks still won't matter if the table names are different.  And, it won't be the locks that kill the second process if the names are the same... it will be an "Object already exists" error, instead.

    The real key is that, as Greg's test shows, both the Insert/Select and the Select/Into just beat the living tar out of DTS.  The "hard knocks" code wins either way.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The reason why is that the optimizer is smart enough to lock different extents in the systems tables for each temp table (or real table if you want) so although I agree the locks exist, you don't have to be careful with using SELECT...INTO because the system is mighty smart about how it does it.

    How many extents do you have in sysobjects?

    Maybe it's not important to you, but I try very hard to have as few locks held and for as short a time as possible. (a quick look at my prod server tells me there are currently 1620 locks held. Come midmorning, it'll probably be 2-5 times that)

    I'm glad that you 'hard-knocks code' has always worked for you. I'm just saying don't discount DTS as always slower.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Sysobjects will have new pages and extents for each new table.

    And, I have to discount DTS as always slower... I've always been able to beat it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 6 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply