Improve Performance -Update query with OR in WHERE

  • Hi! How do we improve this Update query Performance as currently it is taking 1 Minute to update 17643 Rows.

    Query ->

    Declare @date datetime = '09-Dec-2013'

    update #tmp_sp_abc

    set test = rtrim(xyz_test)

    from #tmp_sp_abc t1,

    t_h_bg_pc_it t2

    where (t2.id_i = t1.i or t2.id_s1 = t1.s)

    and t1.r_type = 1

    and t2.[date] = @date

    Tables Row Count: -

    #tmp_sp_abc -> 125352

    t_h_bg_pc_it -> 14798 Rows

    t_h_bg_pc_it table has 300 columns with primary key on id_i column

    and

    #tmp_sp_abc has 11 columns with no primary key and no indexes.

    found that "OR" condition is the root cause of this much time consumption but, can't change it.

    tried to add indexes on: -

    Table: - t_h_bg_pc_it

    Columns: - [xyz_test], [id_i], [id_s1], [date]

    Table: - #tmp_sp_abc

    Columns: - , , [r_type] include [test]

    but, by doing this saved only 5 seconds.

    Attaching the Execution Plan Snaps (Without above indexes and with indexes).

    Please advice.

  • Could you post the execution plans please, not pictures of the plans?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • what if you change the Or to be two distinct update statemetns instead?

    DECLARE @date DATETIME = '09-Dec-2013'

    --part1

    UPDATE #tmp_sp_abc

    SET test = RTRIM(xyz_test)

    FROM

    #tmp_sp_abc t1,

    t_h_bg_pc_it t2

    WHERE t2.id_i = t1.i

    AND t1.r_type = 1

    AND t2.[date] = @date

    --part2

    UPDATE #tmp_sp_abc

    SET test = RTRIM(xyz_test)

    FROM

    #tmp_sp_abc t1,

    t_h_bg_pc_it t2

    WHERE t2.id_s1 = t1.s

    AND t1.r_type = 1

    AND t2.[date] = @date

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Hi!

    First of all can't separate the OR. and tried this but, getting diff. in output of 5417 rows (may be due to some rows have values in both columns. That can be manageable).

    But, biz. guys are not allowing to user 2 UPDATE statements.

  • bharat sethi (12/18/2013)


    Hi!

    First of all can't separate the OR. and tried this but, getting diff. in output of 5417 rows (may be due to some rows have values in both columns. That can be manageable).

    But, biz. guys are not allowing to user 2 UPDATE statements.

    Sometimes the performance of this type of join can be improved by left joining the table twice and filtering for a match in at least one of them, like this:

    UPDATE t1 SET

    test = RTRIM(ISNULL(t2.xyz_test,t3.xyz_test))

    FROM #tmp_sp_abc t1

    LEFT JOIN t_h_bg_pc_it t2

    ON t2.id_i = t1.i

    AND t2.[date] = @date

    LEFT JOIN t_h_bg_pc_it t3

    ON t3.id_s1 = t1.s

    AND t3.[date] = @date

    WHERE t1.r_type = 1

    AND (t2.id_i IS NOT NULL OR t3.id_s1 IS NOT NULL)

    However, without an execution plan to work with, it's a stab in the dark. I'd try creating a clustered index on t1.i and an ordinary index on t1.s. Also, always run the query as a SELECT first and check the execution plan:

    SELECT

    test,

    RTRIM(ISNULL(t2.xyz_test,t3.xyz_test))

    FROM #tmp_sp_abc t1

    LEFT JOIN t_h_bg_pc_it t2

    ON t2.id_i = t1.i

    AND t2.[date] = @date

    LEFT JOIN t_h_bg_pc_it t3

    ON t3.id_s1 = t1.s

    AND t3.[date] = @date

    WHERE t1.r_type = 1

    AND (t2.id_i IS NOT NULL OR t3.id_s1 IS NOT NULL)

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • PFA the ExecutionPlan and please advice.

  • Here's an alternative query form which reduces the row count from the permanent table:

    UPDATE t1 SET

    test = RTRIM(x.xyz_test)

    FROM #tmp_sp_abc t1

    CROSS APPLY (

    SELECT TOP 1 xyz_test

    FROM t_h_bg_pc_it t2

    WHERE (t2.id_i = t1.i OR t2.id_s1 = t1.s)

    AND t2.[date] = @date

    -- choose a sensible ORDER BY

    ) x

    WHERE t1.r_type = 1

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi!

    Please assist that "Order By" clause is necessary in this query and is there any wrong outcome if we do not use this Order by clause? Please share the info. on this and advise do we need to use the Columns in Order By which we have in WHERE clause? Need to know so that will take care of this.

  • Hi,

    Create another index as below:

    Table: - #tmp_sp_abc

    Columns: - , , [r_type] include [test]

    Also, you can create an indexed view on [t_h_bg_pc_it] with only required columns. You need to consider how often [t_h_bg_pc_it] is being updated as Indexed view may slow down updates.

    Index on [t_h_bg_pc_it] table, should start with columns that you use in Where clause and include columns that you use in Set clause.


    Kindest Regards,

    darshan Joshi

  • bharat sethi (12/18/2013)


    Hi!

    Please assist that "Order By" clause is necessary in this query and is there any wrong outcome if we do not use this Order by clause? Please share the info. on this and advise do we need to use the Columns in Order By which we have in WHERE clause? Need to know so that will take care of this.

    The CROSS APPLY block resolves the case where the temp table has multiple matches in t_h_bg_pc_it using your chosen predicates. If there are no multiple matches then you don't need TOP or ORDER BY. If there are multiple matches then you should examine them. If the values of xyz_test are identical for each row then retain TOP but you don't strictly need ORDER BY (I'd use it in any case - match the ordering of the index/cluster you're reading from to avoid an unnecessary sort). If the values of xyz_test are not identical for each row then ORDER BY offers you an opportunity to choose between them.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply