Effect of auto statistics on performance of large INSERT/DELETE

  • We're doing large deletes and inserts as part of a data warehouse load. We're looking to optimize this as users have demanded mid-day loads that are currently taking 1-2 hours.

    The general process for each fact table is:

    1. Drop non-clustered indexes.

    2. Large delete of data we will be importing (uses clustered index to determine what to drop).

    3. Drop clustered index.

    4. BULK INSERT data.

    5. Rebuild clustered index.

    6. Rebuild non-clustered indexes.

    When all tables are done, then

    7. UPDATE STATISTICS on all tables

    The delete and insert steps are still slowest, followed by the non-clustered index rebuilds. Since I'm on 2000 EE, the non-clustered indexes happen in parallel on an 8-way machine with 4 Gig, so they're relatively fast.

    I'm wondering whether auto-statistics could be slowing the delete and/or insert steps. The Fact tables don't change except for batch loads, and there's always an index re-create then.

    Should I turn off auto-statistics for these tables, and use the NORECOMPUTE option on the UPDATE STATISTICS statement?

    Is the UPDATE STATISTICS needed at all, since all the indexes are re-created as part of the batch load? In other words, does CREATE INDEX always update statistics also? BOL doesn't seem to answer this clearly. If so, the previous question applies to the NORECOMPUTE option on the CREATE INDEX statement.

    Finally, does the presence of a _Wa_Sys auto-column statistiscs index affect whether BULK INSERT can use a non-logged insert? I think they don't, since the _Wa_Sys indexes aren't really indexes but rather statistics holders, but I want to be sure on that point.

    Thanks for any insight you can lend.

    Vince

  • It's probable for a data warehouse that the auto stats are not required, as long as you manually update stats after a load.

    Placing the database in simple recovery mode should speed up inserts and deletes during the data load. However, maybe your data array(s) need more spindles as this will affect speed.

    The auto stats shouldn't imapct anything as it's the compiler that uses them.

    As I understand it a create index does the stats too, but I usually do the stats just to make sure after a data load.

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

  • Colin,

    Thanks for the reply. We're already in simple recovery mode. We may go with manual stats now.

    Spindles - could be an issue, but I'm not dealing with hardware as much as I'd like to. Different union. May need to push it but I want to exhaust software config first.

    Still looking for answers on whether the _Wa_Sys indexes cause a slow bulk load.

    Vince

  • No I've never known the WA_xxxx stats to have any impact ( assuming auto-create and auto_update stats are off )  I sometimes look to the wa_xxx to show me where there are missing indexes.

    I was advised that auto create stats are good ( helps the optimiser ) - auto update maybe not so good - I usually go for manual updates every night.

    My prefered route is to do all the work myself to make sure, it's been a solution with some of the DW loads where I work.  I usually do index builds, dbcc updateusage, sp_updatestats. ( or just the stats on the tables concerned ) You might want to check your processes during load for cxpacket locks - they are often a sign of out of date stats ( amongst other things )

    [font="Comic Sans MS"]The GrumpyOldDBA[/font]
    www.grumpyolddba.co.uk
    http://sqlblogcasts.com/blogs/grumpyolddba/

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply