How do I speed up big inserts?

  • I receive a daily text file with up to 8 million rows in it.

    I bcp it into MyFeedTable.

    I perform business logic on MyFeedTable to clean it up, then pass the contents to CleanDataTable, which grows daily.

    I use this:

    insert into CleanDataTable
    select * from MyFeedTable

    This last insert takes a very long time and seems to be getting longer every day as CleanDataTable grows.

    My indexes look okay and appropriate. I've read that the whole insert will be written to the transaction log before being committed, so this clearly takes up some of the time. I've also read that you can't turn off logging.

    Anybody got any tricks?

    thanks,

    Bruce

  • You might want to try putting your transaction log on a different disk set from your data files. It takes the pressure off; the writes to the data files and the writes to the log can then happen in parallel.
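
    If you want to move the existing log onto another disk set, something along these lines should do it; untested, and MyDb plus the file paths are just placeholders for your own names:

    -- Check where the data and log files live today
    EXEC sp_helpfile;

    -- Detach, move the .ldf to the other disk set at the OS level,
    -- then re-attach pointing at the new location (back up first!)
    EXEC sp_detach_db 'MyDb';
    EXEC sp_attach_db 'MyDb',
        'D:\Data\MyDb.mdf',
        'E:\Logs\MyDb_log.ldf';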

    Meg

  • Meghardy is right: you can try putting the data files and log files on different disk sets. If you have RAID configured, place them under two different RAID controllers, which will give you the best performance. Also try making a filegroup where you can place your indexes; see the sketch below.
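
    Something like this; untested, and CleanDataDb, the F: drive, and the index/column names are only examples:

    -- Put a filegroup on its own disk and build the index there,
    -- so index writes don't compete with writes to the table's data
    ALTER DATABASE CleanDataDb ADD FILEGROUP IndexFG;

    ALTER DATABASE CleanDataDb ADD FILE
        (NAME = CleanDataDb_ix,
         FILENAME = N'F:\IndexFiles\CleanDataDb_ix.ndf')
    TO FILEGROUP IndexFG;

    CREATE NONCLUSTERED INDEX IX_CleanDataTable_SomeCol
        ON CleanDataTable (SomeCol) ON IndexFG;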

    levi


    Jesus My Saviour

  • Consider using bulk insert. You'd have to time it, but in some cases it may make sense to drop your indexes and add them back afterward.
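
    Something like this, untested; the file path, terminators, and index/column names are just placeholders:

    -- Load the raw feed with minimal overhead
    BULK INSERT MyFeedTable
    FROM 'C:\feeds\daily.txt'
    WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', TABLOCK);

    -- Drop the indexes on the target, do the big insert, then rebuild
    DROP INDEX CleanDataTable.IX_CleanDataTable_SomeCol;

    INSERT INTO CleanDataTable
    SELECT * FROM MyFeedTable;

    CREATE NONCLUSTERED INDEX IX_CleanDataTable_SomeCol
        ON CleanDataTable (SomeCol);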

    Andy

  • Also, when you add them back, make sure you add the clustered index first, then the non-clustered ones. The reason is that every time a clustered index is dropped or created, all existing non-clustered indexes are automatically rebuilt. I also agree with Andy: look at the bulk insert solution to cut down on logging. Note though, if you're using SQL 2000, be careful with your recovery model; see the BOL extract below, and the sketch that follows it.

    quote:


    RECOVERY FULL | BULK_LOGGED | SIMPLE

    When FULL is specified, database backups and transaction log backups are used to provide full recoverability from media failure. All operations, including bulk operations such as SELECT INTO, CREATE INDEX, and bulk loading data, are fully logged. For more information, see Full Recovery.

    When BULK_LOGGED is specified, logging for all SELECT INTO, CREATE INDEX, and bulk loading data operations is minimal and therefore requires less log space. In exchange for better performance and less log space usage, the risk of exposure to loss is greater than with full recovery. For more information, see Bulk-Logged Recovery.

    When SIMPLE is specified, the database can be recovered only to the last full database backup or last differential backup. For more information, see Simple Recovery.

    SIMPLE is the default setting for SQL Server Personal Edition and the desktop engine, and FULL is the default for all other editions.

    The status of this option can be determined by examining the Recovery property of the DATABASEPROPERTYEX function.
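
    A quick sketch of putting that into practice around the nightly load; untested, and MyDb is a placeholder:

    -- Check the current recovery model
    SELECT DATABASEPROPERTYEX('MyDb', 'Recovery');

    -- Switch to minimal logging for the bulk operations, then back
    ALTER DATABASE MyDb SET RECOVERY BULK_LOGGED;

    -- ... run the bulk insert / index rebuilds here ...

    ALTER DATABASE MyDb SET RECOVERY FULL;
    -- take a log backup afterward so point-in-time recovery resumes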


    "Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

  • Thanks everybody!

    These are all great tips.

    Yes, we have a RAID set up but its configuration is out of my control. I'm hoping it's optimized.

    I'll do some experimenting with bulk inserts and dropping/recreating the indexes.

  • Check Performance Monitor for disk queue length, total processor time, and the buffer cache hit ratio.

    If the disk queue length is more than 2 per disk in the array, then disk I/O is the culprit.

    If total processor time is above 80% for the duration of the insert, then the processor is the culprit.

    If the buffer cache hit ratio is below 90%, then memory is the culprit.
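
    If you'd rather check from inside SQL Server than from Perfmon, something like this works; it assumes SQL 2005 or later, since the DMV doesn't exist in SQL 2000:

    -- Buffer cache hit ratio from the Buffer Manager counters;
    -- the ratio counter has to be divided by its base counter
    SELECT 100.0 * r.cntr_value / b.cntr_value AS buffer_cache_hit_ratio
    FROM sys.dm_os_performance_counters r
    JOIN sys.dm_os_performance_counters b
        ON b.object_name = r.object_name
    WHERE r.counter_name = 'Buffer cache hit ratio'
      AND b.counter_name = 'Buffer cache hit ratio base';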

    Prakash

    Prakash Heda
    Lead DBA Team - www.sqlfeatures.com
    Video sessions on Performance Tuning and SQL 2012 HA

  • In general those are good points, but in this case you probably don't want to size your system based on a single daily task. If you're consistently at those values, then it's time to perf tune and/or upgrade the hardware.

    Andy

  • Very good post.

    Give me a fish, you feed me for a day! Teach me to fish, you feed me for a lifetime.
