Copying varchar(max) fields is slow

  • Hi

    I'm archiving some data in a two-step process:

    1. Copy the old data from each table in LiveDB to the same table in ArchiveDB.

    2. Delete the data from each table in LiveDB that is now in ArchiveDB.

    Both DBs are in SIMPLE recovery mode.

    Each table has a clustered PK on a single int column, in both DBs.

    The tables with varchar(max) columns are taking a very long time to copy over.

    Is there anything I can change in the ArchiveDB to make it run faster?

    It is the insert that is taking the time. I've tried dropping the clustered PKs on the ArchiveDB tables and then rebuilding them afterwards, but it has not made any difference. After all, I am adding data to the ArchiveDB in clustered index order, so I wouldn't have expected it to.

    Any tips? I can change the Archive DB but cannot touch the schema/settings of the Live DB.

    Thanks

  • terry999 (3/13/2015)

    How much data are we talking about here? Do the varchar(max) columns hold a lot of data too? Since this is just data being archived, why such a concern about speed? You could do this in batches, which would ease the pressure on the transaction log and would likely improve performance.
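
    A rough sketch of what the batched copy could look like, assuming an int clustered PK and identical table definitions in both DBs (BigTable, ID, and @OldestKeep are placeholder names):

    DECLARE @rows int = 1;
    DECLARE @OldestKeep int = 1000000;  -- hypothetical boundary: IDs below this are "old"

    WHILE @rows > 0
    BEGIN
        INSERT INTO ArchiveDB.dbo.BigTable
        SELECT TOP (100000) src.*          -- assumes identical column lists in both DBs
        FROM LiveDB.dbo.BigTable AS src
        WHERE src.ID < @OldestKeep
          AND NOT EXISTS (SELECT 1
                          FROM ArchiveDB.dbo.BigTable AS arc
                          WHERE arc.ID = src.ID)
        ORDER BY src.ID;                   -- take the lowest remaining IDs first

        SET @rows = @@ROWCOUNT;            -- stop when no old rows are left to copy
    END;

    (If the PK is an IDENTITY column, the insert would also need an explicit column list and SET IDENTITY_INSERT ArchiveDB.dbo.BigTable ON/OFF around it.)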

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Moden's splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Hi Sean, thanks for replying.

    It's an Express edition that is getting close to its per-database size limit. The varchar(max) fields mostly contain data, and a character length > 15,000 is common.

    I have split the large tables into batches.

    I was hoping there might be some setting for the ArchiveDB that would improve insert performance even if it degraded select performance. The archive is just for reference.

  • Sorry, I forgot: total data movement is about 5GB, and 90% of that 5GB is in the 2 tables with the varchar(max) columns.

    Moving this data takes >95% of the time; I haven't deleted it yet, and again I will delete in batches.

    My machine is slow, but it still takes an hour to move the data.

  • terry999 (3/13/2015)

    Move it in batches. That is 5GB+ in your transaction log if you are doing it all at once.


  • What Sean said, but also: if you need to move a large amount, pre-grow your log file so the growth doesn't have to happen during the copy.
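
    For example, for the ArchiveDB (the logical file name here is a guess; the first query shows the real one):

    -- Find the logical name and current size of the log file
    SELECT name, size, growth
    FROM ArchiveDB.sys.database_files
    WHERE type_desc = 'LOG';

    -- Pre-grow it once, rather than letting auto-growth happen mid-copy
    ALTER DATABASE ArchiveDB
    MODIFY FILE (NAME = ArchiveDB_log, SIZE = 2GB);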

  • Thanks for replying.

    I sized the Archive DB to 5GB to start with and gave it a large tran log.

    I notice that, even though I'm doing it in batches:

    1. The tran log of the DB having data inserted into it is OK, so it doesn't need to grow.

    2. The tran log of the DB being deleted from grows by approximately 5GB, even though I split the deletes on the big tables (>90% of the data) into batches. I can see the deletes running as batches (PRINT statements).

    Should I be issuing manual CHECKPOINTs?
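
    For example, would something like this after each delete batch help? (Placeholder names; in SIMPLE recovery the log is only truncated at a checkpoint.)

    DELETE TOP (100000) lv
    FROM LiveDB.dbo.BigTable AS lv
    WHERE EXISTS (SELECT 1
                  FROM ArchiveDB.dbo.BigTable AS arc
                  WHERE arc.ID = lv.ID);

    CHECKPOINT;  -- lets the inactive part of the log be reused before the next batch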

    Thanks Terry

  • What are the initial size and growth settings for the DB and the related log file?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff

    I set the Archive DB to 5GB to start off with and it doesn't grow; its tran log is 1GB and it also does not need to grow.

    The Live DB is the problematic one (log set to 10% auto-grow, starting at 10MB).

    I plan to increase this before archiving. I thought 1GB would be fine for each table/batch, since the Archive DB tran log copes with that. However, the Live DB log (the one where the deletes are happening) grows to >5GB!

    I must have something wrong with my batch processing on the deletes; I will look at it tomorrow.

    Thanks for your posts. I'd concentrated on the ArchiveDB log etc., which I created, and forgot about the LiveDB.

  • Thanks, Terry. Sounds like you've done it right on the archive side.

    On the DELETEs, remember that they affect ALL indexes and ALL FKs pointing at the table. If there are any INDEXED VIEWS, those will be affected, as well. We have one super-wide (137 column) table at work (not my doing) that takes nearly 60 seconds to delete a single row from because of all that, not to mention that it has more than 1 blob datatype in it.
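
    A quick way to see everything a DELETE on a table has to touch is to check the catalog views; a rough sketch, run in the Live DB, with dbo.BigTable as a stand-in name:

    USE LiveDB;
    GO

    -- Indexes the DELETE has to maintain on the table
    SELECT i.name, i.type_desc
    FROM sys.indexes AS i
    WHERE i.object_id = OBJECT_ID('dbo.BigTable');

    -- Foreign keys referencing the table (each is checked for every deleted row)
    SELECT fk.name, OBJECT_NAME(fk.parent_object_id) AS referencing_table
    FROM sys.foreign_keys AS fk
    WHERE fk.referenced_object_id = OBJECT_ID('dbo.BigTable');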

    --Jeff Moden



  • Jeff,

    My fault, I was doing this:

    use ArchiveDB
    GO

    DELETE FROM LiveDB.dbo.tblCust
    WHERE EXISTS (
        SELECT TOP (100000) tArchive.CustID
        FROM dbo.a_tblCust AS tArchive
        WHERE LiveDB.dbo.tblCust.custID = tArchive.custID
    )

    The TOP inside the EXISTS subquery doesn't limit the outer DELETE, so that statement deleted all the matching rows in one go. This worked correctly:

    DELETE TOP (100000) LiveDB.dbo.tblCust
    FROM a_tblCust AS tArchive
    WHERE LiveDB.dbo.tblCust.custid = tArchive.custid

    I'm joining the archive to the live DB: a cross-database join on 2 clustered indexes, i.e. the PKs on CustID as an integer.

    I'm doing this N times, where N is the number of batches.

    If I were to select just the CustIDs into a #Temp table, would it be quicker joining to tempdb rather than to LiveDB? I wouldn't think so, since it is still a cross-database join. Or is there something special about tempdb? Just curious.
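
    For reference, one batch of that #Temp variant might look something like this (no claim that it is faster; the delete still crosses databases):

    -- Grab a batch of archived CustIDs that still exist in the live table
    SELECT TOP (100000) tArchive.CustID
    INTO #ids
    FROM dbo.a_tblCust AS tArchive
    WHERE EXISTS (SELECT 1
                  FROM LiveDB.dbo.tblCust AS lv
                  WHERE lv.custID = tArchive.CustID);

    -- Delete by joining the live table to the temp table
    DELETE lv
    FROM LiveDB.dbo.tblCust AS lv
    INNER JOIN #ids AS i
        ON i.CustID = lv.custID;

    DROP TABLE #ids;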

    Thanks Terry
