Update takes long time

  • We have to update table with 160 mln rows, this is 3 years data. In production it could be more. Now I'm trying to run following script, it fails after 3 days saying that it was timed out:

    update SalesDate

    set [Class]= i.class, Department=i.dept, Division=i.div, Subclass=i.subclass

    from SalesDate s inner join Item i on i.Item_SK=s.Item_SK where s.dateid>='20110201' and s.dateid<='20110228'

    go

    I've 28 queries for 3 years data (as I still didn't get the concept of how to use loop).

    Is there any way to improve this script so that it can be committed faster? I have created index on dateid and item_sk

  • An execution plan or two would help loads. I'd suggest an estimated plan for the query shown, and an actual plan for a shorter time period so you're not waiting all day for it. Post as .sqlplan attachments.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Also, actual plan for this query:

    SELECT

    i.Item_SK,

    [Class]= i.class,

    Department= i.dept,

    Division= i.div,

    Subclass= i.subclass,

    [Rows]= COUNT(*)

    FROM SalesDate s

    INNER JOIN Item i

    ON i.Item_SK = s.Item_SK

    WHERE s.dateid >= '20110201'

    AND s.dateid <= '20110228'

    GROUP BY i.Item_SK, i.class, i.dept, i.div, i.subclass

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • As chris suggested please post execution plan.

    By any change SalesDate table has any indexes if it is on which columns.

    Regards
    Durai Nagarajan

  • I've included few of them in attachment.

    I've 2 index on Salesdate table, both of them are non-clustered, one is for DateID and another one is for Short_SKU which is the key column that join with Item.

    By they way this machine is virtual machine that have 16 GB Ram, Intel(R) Xeon(R) CPU E5530 @ 2.40GHz with 8 Virtual processors. The only machine that is up on our Virtual Server and initially I thought it was HD performance problem and I've set Hard settings to _Independent - Persistant_ (if these info is helpful).

    Thanks for your help

  • Thanks for posting those. It looks to me that the indexing strategy for the two tables is a bit hit or miss, meaning there's plenty of scope for improvement.

    With tables of this size it's likely to take a while to create an index. Enterprise editions of SQL Server allow you to create an index without locking users out of the table. If you're not using Enterprise edition then you will have to set aside downtime to get the job done.

    You will almost certainly benefit from a clustered index on both tables - refer to this Microsoft paper to understand why. Choosing the best column(s) for a clustered index is important and not always straightforward. If you post up the DDL for the two tables then folks here will help you decide.

    It may be quicker to drop the existing indexes on the two tables then recreate them after creating the clustered index because the ordinary indexes will be updated to replace RIDs with cluster keys.

    Once you've got your clustered indexes in place you can work on the non-clustered indexes, and to do this effectively you need to know the pattern of usage of the indexes you already have and the index recommendations that SQL Server accumulates. These can be interrogated using the system DMV's and to save yourself time writing sensible queries from them, Glenn Berry has a very well known and widely respected suite of such queries here[/url]. Download the set for your server version and separate out the index-specific queries. The ones you should be looking for are:

    (Query 49) (Bad NC Indexes)

    (Query 50) (Missing Indexes)

    (Query 51) (Missing Index Warnings)

    Interpretation of the results of these queries takes a little practice, you may benefit from posting here and inviting recommendations.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply