Query tuning help

  • Hi All,

    A query is taking more than 10 minutes and I don't know what's happening. I'm seeking input on tuning this particular query.

    My questions:

    ===========

    1) I don't see any wait info in sp_whoisactive. What is it doing? The status is always "runnable"; what does that mean? Average CPU utilization is only 40-55% on the server; however, the MAXDOP setting is 1. (See the sp_whoisactive sketch after these questions.)

    2) How can we optimize this query to run faster?
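
    For reference, this is roughly how I am checking it with sp_whoisactive (this assumes Adam Machanic's sp_WhoIsActive is installed; the option values are just what I would try):

    EXEC sp_WhoIsActive
         @get_task_info = 2, -- extra task-level detail, including waits for tasks in the runnable queue
         @get_plans = 1;     -- include the plan for the running statement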

    I collected the data below and am attaching everything in a zip file, "TuneQry.zip":

    - Table and index definitions - "tables_ddl.txt"

    - Query text - "src_qry"

    - Actual execution plan (collected from SSMS) - "SSMSActualPlan.sqlplan"

    - SET STATISTICS IO, TIME ON output (collected from SSMS) - "statsitics.txt"

    - XML plan (collected from SSMS) - "xmlplan.xml"

    - Missing index details (collected from SSMS) - "missing idx.sql"

    - Actual execution plan (collected using SQL Sentry Plan Explorer) - "Plan1.pesession"

    - Table sizes and row counts - "Tablesizes and rowcounts.PNG"

    - Index fragmentation details - "Index Fragmentation Details.PNG"

    - When the stats were last updated for those tables - "stats info.PNG"

    - sp_configure settings - "sp_configure settings.PNG"

    - CPU info - "cpu_info.PNG"

    - Memory info - "memory_info.PNG"

    Environment details

    =====================

    Windows

    =========

    Microsoft Windows Server 2012 R2 Datacenter

    SQL Server

    ==========

    Microsoft SQL Server 2012 (SP4) (KB4018073) - 11.0.7001.0 (X64)

    Aug 15 2017 10:23:29

    Copyright (c) Microsoft Corporation

    Enterprise Edition (64-bit) on Windows NT 6.3 <X64> (Build 9600: ) (Hypervisor)

    Regards,

    Sam

  • It always makes me wonder when I see a machine with 12 cores and MAXDOP = 1.

    Has it been proven that this is the correct setting for this machine, or was it set as a "fix" for some queries that no one knew how to change?

    Unless proven otherwise, I would advise MAXDOP 4 and changing the cost threshold for parallelism to 50; tune as needed.
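
    If you do try that, here is a minimal sketch of applying both settings via sp_configure (4 and 50 are just the starting points suggested above, not proven values for your box):

    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'max degree of parallelism', 4;
    EXEC sp_configure 'cost threshold for parallelism', 50;
    RECONFIGURE;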

     

    Regarding the time it took: returning that much data to SSMS will always slow things down. Always insert into a temp table when checking performance/timing.

    As for the recommended index: although it could help a small part of the query, I do not think it would make much of a difference to the overall performance, so leave it as is for now. Data volumes on that part of the query are quite small.

    Below is a small change to the query - no longer using the old join style, and inserting into a temp table.

    Note that switching between the new and old join styles can affect the plan the SQL engine uses. Not always, but it can, and I've seen cases where the "old style" was faster.

    Also, as a test option, run the query twice; on the second run, uncomment the MAXDOP hint to see if it runs better.

    Some comments are inline below.

    if object_id('tempdb..#events') is not null
        drop table #events

    select a.notf_chg_evnt as notf_chg_evnt
         , a.notf_btch_dtl_id as notf_btch_dtl_id
         , a.chg_detct_time as chg_detct_time
         , a.reg_id as reg_id
         , b.duns_nbr as duns_nbr
         , a.elmnt_name as mntr_elmnt_name
         , d.col_seq_no as col_seq_no
         , d.db_col_name as db_col_name
         , d.db_col_datatype as db_col_datatype
         --, isnull(nullif(c.new_val, ''), '<NULL>') as new_val -- do this in the interface or on retrieval from the temp table
         , c.new_val as new_val
         , e.attr_val as attr_val
    into #events
    from (select a.notf_chg_evnt
               , a.notf_btch_dtl_id
               , b.notf_prcs_job_id
               , a.reg_id
               , a.chg_detct_time
               , a.elmnt_name
               , a.elmnt_xpath
               , row_number() over (partition by a.reg_id, a.elmnt_name, a.elmnt_xpath, b.notf_prcs_job_id
                                    order by case when a.chg_detct_time is null then 1 else 0 end desc
                                           , a.chg_detct_time desc) as chg_evnt_rownum
          from DM_NOTF_CHG_EVNT a
          inner join DM_NOTF_BTCH_DTL b
                  on a.notf_btch_dtl_id = b.notf_btch_dtl_id
                 and b.notf_prcs_job_id = 167
         ) a
    inner join dm_reg_smry b
            on a.reg_id = b.reg_id -- reg_id is the PK index column, so it's OK
           and a.chg_evnt_rownum = 1
    inner join dm_notf_chg_evnt_attr c
            -- should this not be the id field instead? NOTF_CHG_EVNT_ID
            on a.notf_chg_evnt = c.notf_chg_evnt
    inner join dm_lkp_landing_table_attr d
            -- the join column above is a varchar(3000) - I advise creating a hash (SHA1) on both tables, with an index on it, so the join can be faster
            on c.elmnt_xpath = d.xpath_values
    inner join dm_lkp_attr e
            on e.attr_name = 'CHG_TYP'
           and e.attr_id = c.chg_typ_id -- attr_id is the PK index column, so it's OK
    inner join dm_mntr_profile f
            on d.product_id = f.mp_dnb_product_id
           and b.mp_id = f.mp_id -- mp_id is the PK index column, so it's OK
    --option (maxdop 4) -- on the second run, try with this option uncommented
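
    On the hash suggestion in the comment above, a rough sketch of what I mean is below. The column and index names are made up, so adjust to your standards; note that HASHBYTES on this version of SQL Server is limited to 8000 bytes of input, which is fine for a varchar(3000).

    -- persisted computed columns holding a 20-byte SHA1 hash of the long join keys (names are hypothetical)
    alter table dm_lkp_landing_table_attr
        add xpath_values_hash as cast(hashbytes('SHA1', xpath_values) as binary(20)) persisted;
    create index IX_dm_lkp_landing_table_attr_hash
        on dm_lkp_landing_table_attr (xpath_values_hash);

    alter table dm_notf_chg_evnt_attr
        add elmnt_xpath_hash as cast(hashbytes('SHA1', elmnt_xpath) as binary(20)) persisted;
    create index IX_dm_notf_chg_evnt_attr_hash
        on dm_notf_chg_evnt_attr (elmnt_xpath_hash);

    -- the join then compares the short hashes first and rechecks the full strings to guard against (unlikely) collisions:
    --     on c.elmnt_xpath_hash = d.xpath_values_hash
    --    and c.elmnt_xpath = d.xpath_values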
  • Sam,

    Rewriting to use ANSI joins instead of the older equi-join syntax isn't going to fix this, nor will it make up for the apparent total absence of any indexes to support the code. Returning more than 600 thousand rows to the screen might provide your first clue. All the Clustered Index Scans would be your second.

    My personal opinion is that you are now taking advantage of the generosity of others.  You folks didn't even try to fix this code on your own.  If you don't have anyone that knows how to fix this kind of thing, it's high time you hired someone that actually does.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Change the query to this:

    SELECT a.notf_chg_evnt AS notf_chg_evnt,
           a.notf_btch_dtl_id AS notf_btch_dtl_id,
           a.chg_detct_time AS chg_detct_time,
           a.reg_id AS reg_id,
           b.duns_nbr AS duns_nbr,
           a.elmnt_name AS mntr_elmnt_name,
           d.col_seq_no AS col_seq_no,
           d.db_col_name AS db_col_name,
           d.db_col_datatype AS db_col_datatype,
           ISNULL(NULLIF(c.new_val, ''), '<NULL>') AS new_val,
           e.attr_val AS attr_val
    FROM dm_reg_smry b
    CROSS JOIN dm_notf_chg_evnt_attr c -- c is correlated to "a" inside the CROSS APPLY below; a forward reference to "a" would not bind here
    INNER JOIN dm_lkp_landing_table_attr d
            ON d.xpath_values = c.elmnt_xpath
    INNER JOIN dm_lkp_attr e
            ON e.attr_name = 'CHG_TYP'
           AND e.attr_id = c.chg_typ_id
    INNER JOIN dm_mntr_profile f
            ON f.mp_id = b.mp_id
           AND f.mp_dnb_product_id = d.product_id
    CROSS APPLY (SELECT TOP(1)
                        a.notf_chg_evnt,
                        a.notf_btch_dtl_id,
                        z.notf_prcs_job_id,
                        a.reg_id,
                        a.chg_detct_time,
                        a.elmnt_name,
                        a.elmnt_xpath
                 FROM DM_NOTF_CHG_EVNT a
                 INNER JOIN DM_NOTF_BTCH_DTL z
                         ON z.notf_prcs_job_id = 167
                        AND z.notf_btch_dtl_id = a.notf_btch_dtl_id
                 WHERE a.reg_id = b.reg_id
                   AND a.notf_chg_evnt = c.notf_chg_evnt
                 ORDER BY CASE WHEN a.chg_detct_time IS NULL THEN 1 ELSE 0 END DESC,
                          a.chg_detct_time DESC
                ) AS a

    And add these 2 indexes:

    CREATE INDEX IX_DM_NOTF_BTCH_DTL_1 ON dbo.DM_NOTF_BTCH_DTL(notf_prcs_job_id, notf_btch_dtl_id, chg_detct_time DESC);

    CREATE INDEX IX_DM_NOTF_CHG_EVNT_1 ON dbo.DM_NOTF_CHG_EVNT(reg_id, notf_chg_evnt)
    INCLUDE (notf_btch_dtl_id,chg_detct_time,elmnt_name,elmnt_xpath);
  • Jeff Moden wrote:

    Sam,

    Rewriting to use ANSI joins instead of the older equi-join syntax isn't going to fix this, nor will it make up for the apparent total absence of any indexes to support the code. Returning more than 600 thousand rows to the screen might provide your first clue. All the Clustered Index Scans would be your second.

    My personal opinion is that you are now taking advantage of the generosity of others.  You folks didn't even try to fix this code on your own.  If you don't have anyone that knows how to fix this kind of thing, it's high time you hired someone that actually does.

    Please don't take me wrong. I am just trying to get a piece of advice from the experts, no other intention. We are trying a few things at our end, like using temp tables, etc. My only intention was to provide the right information so that I get the right advice. I am not a query tuning expert; however, I know the basics of query plans and indexing. If anyone guides me, I will try to understand it, test it, and make a few suggestions.

    This code was written by the app team; we are just trying to support that team as we do other teams. It is very clear that there is something wrong with the query. I really don't know why it is slow, but I am trying to help as much as I can. I also don't know if a DBA can come straight out and say to the dev team that we can't help, that there is something wrong with your code and that's the reason the query is running slow, so please go and fix it.

    When a query is running slowly, I normally check for blocking and the wait type, and based on that I troubleshoot further. Here it is not showing any wait type and the status is always "runnable", and I don't know why.
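
    For what it's worth, the check I run looks roughly like this (session_id 53 is just a placeholder). From what I have read, "runnable" means the task is waiting in the queue for time on a CPU scheduler rather than on a resource, which would explain why no wait type shows:

    SELECT r.session_id,
           r.status,             -- running / runnable / suspended
           r.wait_type,          -- NULL unless the request is suspended waiting on a resource
           r.last_wait_type,
           r.cpu_time,
           r.total_elapsed_time
    FROM sys.dm_exec_requests AS r
    WHERE r.session_id = 53;    -- placeholder session id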


  • Jonathan AC Roberts wrote:

    Change the query to this:

    SELECT a.notf_chg_evnt AS notf_chg_evnt,
           a.notf_btch_dtl_id AS notf_btch_dtl_id,
           a.chg_detct_time AS chg_detct_time,
           a.reg_id AS reg_id,
           b.duns_nbr AS duns_nbr,
           a.elmnt_name AS mntr_elmnt_name,
           d.col_seq_no AS col_seq_no,
           d.db_col_name AS db_col_name,
           d.db_col_datatype AS db_col_datatype,
           ISNULL(NULLIF(c.new_val, ''), '<NULL>') AS new_val,
           e.attr_val AS attr_val
    FROM dm_reg_smry b
    CROSS JOIN dm_notf_chg_evnt_attr c -- c is correlated to "a" inside the CROSS APPLY below; a forward reference to "a" would not bind here
    INNER JOIN dm_lkp_landing_table_attr d
            ON d.xpath_values = c.elmnt_xpath
    INNER JOIN dm_lkp_attr e
            ON e.attr_name = 'CHG_TYP'
           AND e.attr_id = c.chg_typ_id
    INNER JOIN dm_mntr_profile f
            ON f.mp_id = b.mp_id
           AND f.mp_dnb_product_id = d.product_id
    CROSS APPLY (SELECT TOP(1)
                        a.notf_chg_evnt,
                        a.notf_btch_dtl_id,
                        z.notf_prcs_job_id,
                        a.reg_id,
                        a.chg_detct_time,
                        a.elmnt_name,
                        a.elmnt_xpath
                 FROM DM_NOTF_CHG_EVNT a
                 INNER JOIN DM_NOTF_BTCH_DTL z
                         ON z.notf_prcs_job_id = 167
                        AND z.notf_btch_dtl_id = a.notf_btch_dtl_id
                 WHERE a.reg_id = b.reg_id
                   AND a.notf_chg_evnt = c.notf_chg_evnt
                 ORDER BY CASE WHEN a.chg_detct_time IS NULL THEN 1 ELSE 0 END DESC,
                          a.chg_detct_time DESC
                ) AS a

    And add these 2 indexes:

    CREATE INDEX IX_DM_NOTF_BTCH_DTL_1 ON dbo.DM_NOTF_BTCH_DTL(notf_prcs_job_id, notf_btch_dtl_id, chg_detct_time DESC);

    CREATE INDEX IX_DM_NOTF_CHG_EVNT_1 ON dbo.DM_NOTF_CHG_EVNT(reg_id, notf_chg_evnt)
    INCLUDE (notf_btch_dtl_id,chg_detct_time,elmnt_name,elmnt_xpath);

    Sir, can you please let me know why it is showing "runnable" and not showing any wait type? What could be the reason?

  • How can we set the right expectations with the development team in such situations? What can be done by the DBA, and how do we convey to the application team, in a nice, polite way, that it's their baby and they have to fix it? We can definitely check whether there is any blocking, known waits, or index fragmentation, check whether the stats are out of date and update them with a full scan (a sketch is below), and see if there is any difference.

    Beyond that, if it falls on me personally, it's kind of a stressful situation for me. That's why I try to seek help.
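
    For example, the stats check and full-scan update I mean is roughly this (using one of the tables from the query as an illustration; a sketch only, not verified against the actual schema):

    -- when were the statistics on this table last updated?
    SELECT s.name, STATS_DATE(s.object_id, s.stats_id) AS last_updated
    FROM sys.stats AS s
    WHERE s.object_id = OBJECT_ID('dbo.DM_NOTF_CHG_EVNT');

    -- if they look stale, rebuild them with a full scan
    UPDATE STATISTICS dbo.DM_NOTF_CHG_EVNT WITH FULLSCAN;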

  • You really wonder why your query is slow?  Have you actually looked at the execution plan?

    Issues I found after a quick look at some of the information you provided:

    1. You did not include the DDL for what appears to be the largest table in your query, dm_lkp_landing_table_attr. Based on the actual execution plan Plan1.pesession, this table has over a billion rows of data, and you are doing a clustered index scan (a table scan) on it.
    2. Looking at this same plan, I am seeing a large disparity between estimated and actual row counts. This says a lot as well: missing or inappropriate indexes, stale statistics.
    3. There are two actual filter clauses in your query: one in the derived table and one in the outer query.
    4. In fact, based on the plan, the query is doing clustered index scans on every table.

    These few issues indicate that there are things that need to be changed. One table having over a billion rows of data and another over a million (almost 3 million) says much. What none of us has, however, is actual knowledge of the data and what this query is attempting to accomplish.

    To be honest, you may want to look at bringing in a consultant with performance tuning experience so that they can sit down with you and your developers and actually look at and touch the database to see what could be done to improve the query. It really comes down to knowing the data, and there appears to be a lot of data that needs to be processed.

  • The execution plan is recommending you create the following indexes:

    CREATE INDEX IX_DM_NOTF_CHG_EVNT_ATTR_1 ON [DNB_MDM_MON].[dbo].[DM_NOTF_CHG_EVNT_ATTR]
    (
    [NOTF_CHG_EVNT]
    )
    INCLUDE
    (
    [ELMNT_XPATH],
    [CHG_TYP_ID],
    [NEW_VAL]
    );

    CREATE INDEX IX_DM_NOTF_CHG_EVNT_1 ON [DNB_MDM_MON].[dbo].[DM_NOTF_CHG_EVNT]
    (
    [NOTF_BTCH_DTL_ID]
    )
    INCLUDE
    (
    [NOTF_CHG_EVNT],
    [REG_ID],
    [CHG_DETCT_TIME],
    [ELMNT_NAME],
    [ELMNT_XPATH]
    );

    Have you tried these?

     

  • Thank you Lynn and Jonathan.

  • Sam - did you read my comments?

    Adding those indexes is not necessarily what needs to be done (recommendations from the engine are just that: recommendations).

    And is the field I flagged as a possibly incorrect join column actually the correct one, or is the alternative I mentioned the right one to use?

    As mentioned by others, you and your team do need to learn how to look at queries/plans/stats and figure out what is wrong and what can be changed. Although we try to help here, what we suggest is not always the best option (even if it works better than the original code/setup).

    For example, in this particular case it could be that a temp table holding part of the query would give better results - but for some of this it is try and see, as "it depends".

  • A strong suggestion.

    Listen to what Lynn has said. Your best bet would be to get someone who knows what they're doing in to teach you. Not to have them do the tuning, but to have them show you how to do it. That's what's needed here.

    A second suggestion (a very far second; Lynn is right) would be to get my books and start teaching yourself. Get the query tuning book, turn to the last chapter, and start working through the checklist against your queries. Follow the links in the checklist back into the book (read it backwards, in other words). This would be the quickest way to start getting value immediately (though not the best way to learn).

    However, again, Lynn is right. You should get someone in to train you up. That's going to be the fastest way to resolve the issues you're clearly facing. I say this because, if you're getting scans on every single table in this query, chances are high that this is a pattern. I'll bet all your queries resemble this one, right?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • frederico_fonseca wrote:

    Sam - did you read my comments?

    Adding those indexes is not necessarily what needs to be done (recommendations from the engine are just that: recommendations).

    And is the field I flagged as a possibly incorrect join column actually the correct one, or is the alternative I mentioned the right one to use?

    As mentioned by others, you and your team do need to learn how to look at queries/plans/stats and figure out what is wrong and what can be changed. Although we try to help here, what we suggest is not always the best option (even if it works better than the original code/setup).

    For example, in this particular case it could be that a temp table holding part of the query would give better results - but for some of this it is try and see, as "it depends".

    Hi Frederico,

    I have gone through the points everyone has provided. We are currently working with the app team on making the changes; it will take time. They agreed to rewrite the query. At first glance I saw huge table scans, and the row estimates are way off. They will also be changing the code to use ANSI joins. If they still see row-estimate issues, they might create temp table(s) to make the joins faster.

    I totally agree with everyone's opinion; I don't deny it. I am still learning, and whatever has been said here, I totally get it. However, I can't rewrite the queries myself, as I am unaware of the business logic. Slowly, step by step, I am trying to face these scenarios and the different situations within teams. I am an accidental DBA, learning slowly: gathering basic information on table structures, how the indexes are laid out, fragmentation, stats, etc. I am also trying to get familiar with execution plans: looking for high-cost operators, missing index recommendations, out-of-date stats (by looking at the skew between estimated and actual number of rows), thick lines, sort operators, implicit conversions, tempdb spills, etc. I need some more time and experience to build that confidence and reach that level.

    We are a small shop and we don't get to hire a dedicated consultant for performance tuning. Everything has to be done by us!

    Thanks everyone. Forgive me if I have asked too much or hurt anyone.

     

  • vsamantha35 wrote:

    Thanks everyone. Forgive me if I have asked too much or hurt anyone.

    No, please. Ask away. It's just so clear that you're in over your head and suffering. We're trying to help.

    A piece of advice. You say you don't know the business needs. Learn those too. Understanding the business radically changes how you support it. For example, at the insurance company I used to work at, I didn't know that renewals took place in February and March. All I knew was, suddenly, in February, servers that previously had been practically asleep were now almost on fire because of high-volume loads. If I'd known ahead of time, I could have anticipated the problem rather than spending a week or so doping out what was happening and then having to figure out how to deal with it after everything was already bad.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • vsamantha35 wrote:

    frederico_fonseca wrote:

    Sam - did you read my comments?

    Adding those indexes is not necessarily what needs to be done (recommendations from the engine are just that: recommendations).

    And is the field I flagged as a possibly incorrect join column actually the correct one, or is the alternative I mentioned the right one to use?

    As mentioned by others, you and your team do need to learn how to look at queries/plans/stats and figure out what is wrong and what can be changed. Although we try to help here, what we suggest is not always the best option (even if it works better than the original code/setup).

    For example, in this particular case it could be that a temp table holding part of the query would give better results - but for some of this it is try and see, as "it depends".

    Hi Frederico,

    I have gone through the points everyone has provided. We are currently working with the app team on making the changes; it will take time. They agreed to rewrite the query. At first glance I saw huge table scans, and the row estimates are way off. They will also be changing the code to use ANSI joins. If they still see row-estimate issues, they might create temp table(s) to make the joins faster.

    I totally agree with everyone's opinion; I don't deny it. I am still learning, and whatever has been said here, I totally get it. However, I can't rewrite the queries myself, as I am unaware of the business logic. Slowly, step by step, I am trying to face these scenarios and the different situations within teams. I am an accidental DBA, learning slowly: gathering basic information on table structures, how the indexes are laid out, fragmentation, stats, etc. I am also trying to get familiar with execution plans: looking for high-cost operators, missing index recommendations, out-of-date stats (by looking at the skew between estimated and actual number of rows), thick lines, sort operators, implicit conversions, tempdb spills, etc. I need some more time and experience to build that confidence and reach that level.

    We are a small shop and we don't get to hire a dedicated consultant for performance tuning. Everything has to be done by us!

    Thanks everyone. Forgive me if I have asked too much or hurt anyone.

    As Grant said, keep asking questions. When we make a suggestion like getting a consultant, it is because we feel that is the best way for you to get the help you need. We can still provide assistance; just realize that we are unpaid volunteers with our own jobs to work as well, so we have to find the time to assist, which usually means after hours or, if we are lucky, during slow times at work (not likely, considering the project I am working on at the moment).

    I would suggest filling in the gaps regarding the missing DDL. There are also some other things I may suggest that could be counter-intuitive, but I need time to write those up, and I really need the DDL for that one missing table that happens to have more than a billion rows of data.

    As I have time, I will continue to look at your problem, as I am sure others will. Let us know if there are any other questions we may be able to answer that could benefit your education in SQL.
