Bulk Delete & Insert

  • Hi Experts,

We have a scenario in which an application scheduler inserts processed data (approximately 5 million rows) into a table (ABC). This table is truncated before the data is inserted. The scheduler takes around 3 hours to process and insert the data, which makes the data unavailable to many users.

We have come up with a solution in which the processed data is inserted into a new table (this table doesn't have any indexes). Once the insert into this new table completes, the customer table (ABC) is truncated and the data is inserted into it.

We hope the new solution will keep the data available to users and also reduce the time taken for the insert. Experts, please let me know your thoughts and suggest better solutions if you have any.

    Thanks in Advance.

  • Does the table have a FOR INSERT trigger on it? If so, the trigger has to fire for every row. Disable the trigger while the load is being done.
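As a sketch of that advice (the trigger name here is hypothetical; substitute your own):

```sql
-- Assumes a FOR INSERT trigger named trg_ABC_Insert on dbo.ABC
DISABLE TRIGGER trg_ABC_Insert ON dbo.ABC;

-- ... perform the bulk load into dbo.ABC here ...

-- Re-enable once the load completes
ENABLE TRIGGER trg_ABC_Insert ON dbo.ABC;
```

Note that any logic the trigger would have applied to the loaded rows must then be handled separately, since it never fires for them.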

    Are you shrinking the table after the truncate? If so, then the table needs to auto-grow and that will take some time.

    Do you have an index on the table? If so, try removing the index, loading the data, then re-applying the index.
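A minimal sketch of the drop-load-rebuild pattern, assuming a hypothetical nonclustered index `IX_ABC_Col1` on a column `Col1`:

```sql
-- Remove the index so the load doesn't maintain it row by row
DROP INDEX IX_ABC_Col1 ON dbo.ABC;

-- ... load the ~5 million rows here ...

-- Rebuild the index once, over the full data set
CREATE NONCLUSTERED INDEX IX_ABC_Col1 ON dbo.ABC (Col1);
```

Rebuilding once after the load is generally much cheaper than maintaining the index on every inserted row.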

    -SQLBill

I don't understand from your explanation what that other table is doing for you.

    Adding to what Bill said: why not remove all indexes and constraints from table ABC, insert the data directly into ABC, then re-add the index(es) and constraints?

It is also good to understand what is causing the insert to be slow. For example, if you are doing something like:

    INSERT INTO ABC
    SELECT <... subquery>

    the SELECT statement that drives the insert could itself be the bottleneck if it has issues.
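One thing worth trying on that pattern (a sketch, with `dbo.SourceTable` standing in for whatever query drives the insert): since ABC has just been truncated, a `TABLOCK` hint can allow SQL Server to minimally log the insert under the SIMPLE or BULK_LOGGED recovery model, which often cuts load time considerably.

```sql
-- ABC is empty after the TRUNCATE, so with TABLOCK the insert
-- can qualify for minimal logging (recovery model permitting)
INSERT INTO dbo.ABC WITH (TABLOCK)
SELECT Col1, Col2, Col3          -- placeholder column list
FROM dbo.SourceTable;            -- placeholder for the driving query
```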

Lastly, is it unavoidable that this happens at a time when users need access to the data? Is it possible to schedule the job to run earlier or later, when fewer users are working with the data?

    Those are my thoughts.

"I can't stress enough the importance of switching from a sequential-files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • .

  • Thanks Alan and Bill,

The tables are accessed throughout the day, and this process happens twice a day.

    Which one is faster: processing the data and inserting it (currently taking 3 hours), or inserting already-processed data from one table to another?

Depends. This is where you need to set up a test database/tables and try all your methods out. You can use smaller amounts of data for testing, but you still need enough to show a difference.

    From our end, we can't tell if the bottleneck is in the processing of the data or in the inserting of the data... so we really can't say which will be faster.

    Once you narrow it down to where the actual bottleneck is, then you can work on resolving that issue.

    -SQLBill

  • Ratheesh.K.Nair (5/13/2015)


    Hi Experts,

We have a scenario in which an application scheduler inserts processed data (approximately 5 million rows) into a table (ABC). This table is truncated before the data is inserted. The scheduler takes around 3 hours to process and insert the data, which makes the data unavailable to many users.

    We have come up with a solution in which the processed data is inserted into a new table (this table doesn't have any indexes). Once the insert into this new table completes, the customer table (ABC) is truncated and the data is inserted into it.

    We hope the new solution will keep the data available to users and also reduce the time taken for the insert. Experts, please let me know your thoughts and suggest better solutions if you have any.

    Thanks in Advance.

    Mostly a total waste of time because you've only solved half the problem.

    There are two really fast ways to do this with virtually no downtime. This is one of my favorites...

Have two identical tables with the correct indexes, constraints, etc. One is offline, and a synonym points to the other, making it the online table. Truncate the offline table, repopulate it, rebuild indexes if needed, and do anything else that needs to be done. When complete, drop and rebuild the synonym to point at the newly populated table. Total downtime will likely be less than a millisecond.
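A sketch of that synonym-swap pattern (the table and synonym names `ABC_A`, `ABC_B`, and `dbo.ABC` are illustrative; users and the application query only the synonym):

```sql
-- One-time setup: two identical tables, and a synonym that
-- users query as if it were the table itself
CREATE SYNONYM dbo.ABC FOR dbo.ABC_A;

-- Each refresh cycle: repopulate the offline table (ABC_B here)
TRUNCATE TABLE dbo.ABC_B;
INSERT INTO dbo.ABC_B WITH (TABLOCK)
SELECT Col1, Col2, Col3          -- placeholder for the processing query
FROM dbo.Staging;
ALTER INDEX ALL ON dbo.ABC_B REBUILD;

-- Flip the synonym; this drop/create is the only "downtime"
BEGIN TRANSACTION;
DROP SYNONYM dbo.ABC;
CREATE SYNONYM dbo.ABC FOR dbo.ABC_B;
COMMIT;
```

On the next cycle the roles reverse: ABC_A is repopulated offline and the synonym is flipped back to it.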

    --Jeff Moden


RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
