Long running insert

  • Hi!

    Having problem with an insert that take very long time to complete, there is a warning about Type Conversion in expression and I think it could be the problem?

    Here is the Execution plan:

    https://www.brentozar.com/pastetheplan/?id=SkAwMn0Qg

    Any hint is appreciated.

  • I think it is not related about type conversion.

    - You used a lot of non-sargable conditions a.dim_1 = substring(b.dim_value,1,2)

    -This subquery table index

    "

    SELECT c.description FROM agldimvalue c WHERE c.attribute_id = 'NH'

    AND c.dim_value = 'PER'

    AND c.client = 'KI'

    "

    not cover all fields it uses RID lookup operator

    Actullay you have to change whole query for good performance..

  • Thanks for the tip, will talk to the people who wrote the script.

    /Johan

  • Oh, there is WAY more wrong with this query (from an optimization standpoint) than just a lot of substrings. At least for those you can replace them with LIKE for the case where they start on the first character.

    There are NOT INs, which are tough.

    ORs, which are also tough.

    A Date >= (some derived date), which could be 0 rows or a billion.

    You are aggregating and doing a post-aggregation HAVING.

    This is an estimated plan that "guesses" there will be few rows. If there are large numbers those nested loop joins will crush you with logical IO. You have complete dynamic SQL here, so the optimizer has it's best chance at the "guess" it will be basically forced to do with all of that stuff going on, so a common trick here of OPTION (RECOMPILE) won't help.

    You compare a date to a description field.

    I have no idea via paste the plan if you have any IMPLICIT_CONVERSIONS, but I would certainly not be surprised to find those too.

    Start by replacing SUBSTRING (field, 1, ..) with LIKE. At least that is SARGable.

    Use sp_whoisactive while it is running to see what work is being done. If you are on a late enough patch you can also see live query stats in 2014, which could REALLY help here (as could the ACTUAL execution plan.

    Oh, are there any Foreign Keys or triggers on the table you are INSERTing into? Did you check for blocking while it is running? sp_whoisactive will show that too.

    BUT WAIT - THERE's MORE!! :w00t: You also have non-standard JOIN syntax. And HEAP tables. This query, and your entire database and application design, need some SUBSTANTIAL help!!!

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • In addition to Kevin's comments, you also have potential for unexpected results.

    SELECT CAST(RTRIM( b.description ) AS FLOAT) AS ohproc,

    ...

    Group by b.description

    However many spaces you're trimming are still included when grouping.

    Values:

    '123.45___'

    123.45__'

    123.45'

    would all appear as 123.45 in the results, by you would have distinct lines for each.

    One negative of the old join syntax is that a non-SARGable expression is demoted to the where clause even if it should be the join.

    This "AND a.dim_1 = SUBSTRING( b.dim_value, 1, 2 )" would happen much later in the query. You may see improvements just by changing your join syntax and using an ON clause.

    I think your performance would be improved by separating the work. With all of the things you have going on here, I would filter A & B into temp tables before you join.

    SELECT a.client,

    a.dim_1,

    a.dim_3,

    a.dim_7,

    a.period,

    a.amount

    INTO #a

    FROM agltransact AS a

    WHERE a.client = 'A1'

    AND a.period >= ( SELECT c.description

    FROM agldimvalue AS c

    WHERE c.attribute_id = 'NH'

    AND c.dim_value = 'PER'

    AND c.client = 'KI'

    )

    AND a.account > '4'

    AND a.account < '59'

    AND a.account NOT LIKE '50%'

    AND a.account NOT LIKE '499%'

    AND a.account NOT LIKE '____9'

    AND a.account NOT IN ( '49408', '49418', '5331', '5332', '57317', '57367', '57311', '57848', '5785', '57851', '5800' )

    AND SUBSTRING( a.dim_7, 1, 1 ) IN ( '1', '4', '5' )

    AND SUBSTRING( a.dim_7, 1, 2 )NOT IN ( '53' )

    AND a.dim_7 NOT IN ( '18', '160', '521' )

    AND SUBSTRING( a.dim_3, 3, 1 ) != '9'

    AND a.dim_1 NOT IN ( 'CA', 'CB', 'UF', 'UG', 'UI', 'US' )

    AND ( a.account = '57971'

    OR a.dim_6 NOT IN ( 'BTH', 'DH', 'DI', 'FHS', 'GU', 'HBOR', 'HDAL', 'HGO', 'HH', 'HIG', 'HIS', 'HKAL', 'HKR', 'HTU', 'IHS', 'KAU', 'KF', 'KKH',

    'KMH', 'KTH', 'LHS', 'LINU', 'LIU', 'LTU', 'LU', 'MAH', 'MDH', 'MH', 'OHS', 'ORU', 'SH', 'SKH', 'SLU', 'STDH', 'SU', 'TH',

    'UMU', 'UU', 'VXU'

    )

    );

    SELECT CAST(RTRIM( b.description ) AS FLOAT) AS ohproc,

    b.client,

    b.dim_value,

    b.period_from,

    b.period_to

    INTO #b

    FROM agldimvalue AS b

    WHERE b.client = 'A1'

    AND b.attribute_id = 'NH'

    AND SUBSTRING( b.dim_value, 4, 3 ) = 'GU2'

    AND a.period BETWEEN b.period_from

    AND b.period_to;

    INSERT INTO kiohuttaga2 ( client,

    account,

    dim_1,

    dim_3,

    dim_7,

    period,

    costs,

    ohproc,

    ohcost

    )

    SELECT a.client,

    '49978' AS account,

    a.dim_1,

    a.dim_3,

    a.dim_7,

    a.period,

    SUM( a.amount ) AS costs,

    b.ohproc,

    ROUND( SUM( a.amount ) * b.ohproc, 2 ) AS ohcost

    FROM #a as a

    INNER JOIN #b as b

    ON a.dim_1 = SUBSTRING( b.dim_value, 1, 2 )

    AND a.period BETWEEN b.period_from

    AND b.period_to

    GROUP BY a.client,

    a.dim_1,

    a.dim_3,

    a.dim_7,

    a.period,

    b.description

    HAVING SUM( a.amount ) <> 0;

    Wes
    (A solid design is always preferable to a creative workaround)

  • If you go henriksen's route, a potentially nice improvement would be to substring the field used in this JOIN as you populate table #b:

    ON a.dim_1 = SUBSTRING( b.dim_value, 1, 2 )

    Then you would have a direct equality JOIN there.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • This is one of the cases when the performance problem comes from the DDL. You could create computed columns and lookup tables to improve this query. However, that's a suggestion that I wouldn't dare to complete on a forum post as it needs further analysis.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks for the input and thoughts!

  • Johan Fredin - Thursday, December 15, 2016 3:51 AM

    Thanks for the input and thoughts!

    UPDATE!
    We manage to fix this, removed the ORs and some other improvments also added a clustered index. Now it takes seconds instead of up to 30 minutes.
    Thanks for all the help!

  • Johan Fredin - Thursday, January 26, 2017 7:02 AM

    Johan Fredin - Thursday, December 15, 2016 3:51 AM

    Thanks for the input and thoughts!

    UPDATE!
    We manage to fix this, removed the ORs and some other improvments also added a clustered index. Now it takes seconds instead of up to 30 minutes.
    Thanks for all the help!

    Glad to hear you got the execution time down.  If you can, I'd be interested in seeing the final solution.  🙂

    Wes
    (A solid design is always preferable to a creative workaround)

Viewing 10 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply