Update Performance

  • Hi Guys,

    Any Ideas what the best way would be to perform this update?

    UPDATE

    D

    SETTrack_ID= P.Track_ID,

    Territory_ID= P.Territory_ID,

    Major= P.Major

    FROMStaging.Track_MappingP

    INNER JOINStaging.Track_PlayD

    ONP.ISRC= D.ISRC_Code

    ANDP.Event_Date= D.Event_Date

    Both tables have around 63 million rows and have numerous indexes.

    Any ideas appreciated

    Thanks,

    Rich

  • breaking it into small batches may help you.

    [font="Calibri"]Raj[/font]
  • Do you really want to update all the 63 million rows?

    Check if you can put some filter to reduce the number of rows that will be affected.


    Kingston Dhasian

    How to post data/code on a forum to get the best help - Jeff Moden
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Something like this can help, and it's easy to check;

    UPDATE D SET

    Track_ID = P.Track_ID,

    Territory_ID = P.Territory_ID,

    Major = P.Major

    FROM Staging.Track_Play D

    INNER JOIN Staging.Track_Mapping P

    ON P.ISRC = D.ISRC_Code

    AND P.Event_Date = D.Event_Date

    SELECT

    Track_ID = MAX(P.Track_ID),

    Territory_ID = MAX(P.Territory_ID),

    Major = MAX(P.Major),

    D.ISRC_Code,

    D.Event_Date

    INTO #Updater

    FROM Staging.Track_Play D

    INNER JOIN Staging.Track_Mapping P

    ON P.ISRC = D.ISRC_Code

    AND P.Event_Date = D.Event_Date

    GROUP BY D.ISRC_Code, D.Event_Date

    CREATE UNIQUE CLUSTERED INDEX cx_ISRC_Code_Event_Date ON #Updater (ISRC_Code,Event_Date)

    UPDATE D SET

    Track_ID = P.Track_ID,

    Territory_ID = P.Territory_ID,

    Major = P.Major

    FROM Staging.Track_Play D

    INNER JOIN #Updater P

    ON P.ISRC = D.ISRC_Code

    AND P.Event_Date = D.Event_Date

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (6/7/2013)


    Something like this can help, and it's easy to check;

    UPDATE D SET

    Track_ID = P.Track_ID,

    Territory_ID = P.Territory_ID,

    Major = P.Major

    FROM Staging.Track_Play D

    INNER JOIN Staging.Track_Mapping P

    ON P.ISRC = D.ISRC_Code

    AND P.Event_Date = D.Event_Date

    SELECT

    Track_ID = MAX(P.Track_ID),

    Territory_ID = MAX(P.Territory_ID),

    Major = MAX(P.Major),

    D.ISRC_Code,

    D.Event_Date

    INTO #Updater

    FROM Staging.Track_Play D

    INNER JOIN Staging.Track_Mapping P

    ON P.ISRC = D.ISRC_Code

    AND P.Event_Date = D.Event_Date

    GROUP BY D.ISRC_Code, D.Event_Date

    CREATE UNIQUE CLUSTERED INDEX cx_ISRC_Code_Event_Date ON #Updater (ISRC_Code,Event_Date)

    UPDATE D SET

    Track_ID = P.Track_ID,

    Territory_ID = P.Territory_ID,

    Major = P.Major

    FROM Staging.Track_Play D

    INNER JOIN #Updater P

    ON P.ISRC = D.ISRC_Code

    AND P.Event_Date = D.Event_Date

    I see a lot of extra and expensive operations... But where is a profit?


    Alex Suprun

  • Alexander Suprun (6/7/2013)


    ChrisM@Work (6/7/2013)


    Something like this can help, and it's easy to check;

    UPDATE D SET

    Track_ID = P.Track_ID,

    Territory_ID = P.Territory_ID,

    Major = P.Major

    FROM Staging.Track_Play D

    INNER JOIN Staging.Track_Mapping P

    ON P.ISRC = D.ISRC_Code

    AND P.Event_Date = D.Event_Date

    SELECT

    Track_ID = MAX(P.Track_ID),

    Territory_ID = MAX(P.Territory_ID),

    Major = MAX(P.Major),

    D.ISRC_Code,

    D.Event_Date

    INTO #Updater

    FROM Staging.Track_Play D

    INNER JOIN Staging.Track_Mapping P

    ON P.ISRC = D.ISRC_Code

    AND P.Event_Date = D.Event_Date

    GROUP BY D.ISRC_Code, D.Event_Date

    CREATE UNIQUE CLUSTERED INDEX cx_ISRC_Code_Event_Date ON #Updater (ISRC_Code,Event_Date)

    UPDATE D SET

    Track_ID = P.Track_ID,

    Territory_ID = P.Territory_ID,

    Major = P.Major

    FROM Staging.Track_Play D

    INNER JOIN #Updater P

    ON P.ISRC = D.ISRC_Code

    AND P.Event_Date = D.Event_Date

    I see a lot of extra and expensive operations... But where is a profit?

    Without an execution plan to examine I can't tell if there are any extra and expensive operations, neither can I determine if the mod I've suggested may be of any benefit. Sometimes it is.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Kingston Dhasian (6/7/2013)


    Do you really want to update all the 63 million rows?

    Check if you can put some filter to reduce the number of rows that will be affected.

    I agree with Kingston here, try and minimise the data set that you are updating so that you only update those rows that have changed, something like

    UPDATE

    D

    SET

    Track_ID = P.Track_ID,

    Territory_ID = P.Territory_ID,

    Major = P.Major

    FROM

    Staging.Track_Mapping P

    INNER JOIN Staging.Track_Play D

    ON P.ISRC = D.ISRC_Code

    AND P.Event_Date = D.Event_Date

    WHERE

    D.Track_ID <> P.Track_ID

    OR D.Territory_ID <> P.Territory_ID

    OR D.Major <> P.Major

    If you have to update all 63 million rows on the table, is there a reason why you cant populate the fields when you are loading the data into the table in the first place, a simple SSIS task with a lookup should allow you to add the data in or a statement with the INNER JOIN between the SOURCE feed and the Track_Mapping table should work.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • Thanks for all the help everyone. I have now found out what the user was actually trying to do and this update was to every row in the table, so I'm no longer doing and Update and using a insert into a separate table then truncating and inserting into the existing table instead.

    Thanks Again.

  • Richie T (6/7/2013)


    Hi Guys,

    Any Ideas what the best way would be to perform this update?

    UPDATE

    D

    SETTrack_ID= P.Track_ID,

    Territory_ID= P.Territory_ID,

    Major= P.Major

    FROMStaging.Track_MappingP

    INNER JOINStaging.Track_PlayD

    ONP.ISRC= D.ISRC_Code

    ANDP.Event_Date= D.Event_Date

    Both tables have around 63 million rows and have numerous indexes.

    Any ideas appreciated

    Thanks,

    Rich

    Numerous indexes are counter productive when it comes to performance of UPDATEs, because the more indexes there are, then the more data pages have to be updated and transaction logged. To facilitate this specific UPDATE, what you need is an index on ISRC_Code and Event_Date.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply