Trying to optimize performance of query having Hash Matches & Agreegate

  • hi

  • Could you possibly post the plans as .sqlplan files? http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

    Are you sure you want a nested loop join? It is not always the best join type.

    http://sqlinthewild.co.za/index.php/2009/11/24/the-most-optimal-join-type/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail,

    As you can see in the Execution plan a large amount about 46% is being consumed by Hash joins.

    I am not sure if I want Nested loop joins or merge joins. I want to increase the performance of the query, I thought this is the place I need to modify.

    I am using sql 2000. So I could not save plan in .sqlplan format. The only option for me is to copy and paste onto spreadsheet...Sorry..

    Thank you

  • Can you please post the view definitions:

    vMBlockstoFinal1, vMBlockstoFinal2, vMBlockstoFinal3, vMBlockstoFinal4

    Thanks

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ..

  • First off, nesting views frequently leads to performance problems. I recommend against it.

    Based on the info in the spread sheet, it looks like you have out of date statistics. You've got missing stats warnings.

    I suspect the UNION operation in one of the views is leading to the DISTINCT operator, which is an aggregation.

    A lot of this is speculation based on the information at hand. An actual execution plan instead of an estimated plan would be useful.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (9/24/2010)


    Based on the info in the spread sheet, it looks like you have out of date statistics. You've got missing stats warnings.

    Or maybe auto_create_stats turned off?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Grant Fritchey (9/24/2010)


    First off, nesting views frequently leads to performance problems. I recommend against it.

    Especially when different levels of nesting have different sort criteria.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • GilaMonster (9/24/2010)


    Grant Fritchey (9/24/2010)


    Based on the info in the spread sheet, it looks like you have out of date statistics. You've got missing stats warnings.

    Or maybe auto_create_stats turned off?

    Yep, absolute possibility.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • ..

  • Jaya, I reckon the most cost-effective way to approach your problem at this point in time is to see how each of your queries performs when you sub in the view definitions and remove anything which isn't essential for the resultsets you need from this sproc. It shouldn't take you too long to do, I tried one of them last week and in five minutes it was almost there.

    SELECT --TOP 100 PERCENT

    si.BlockNr,

    si.SpecimenNr,

    si.KeyNr,

    0 AS Par5Avg,

    0 AS Par6Avg,

    si.Failure AS FailureMode,

    da.Par1,

    da.Par2,

    da.Par3,

    'SUB_Perf' AS ViewOrigin

    FROM dbo.vtbl_BlockNrSpecimenInformation si -- <<<<< is this a view too? If so, resolve it out to the view definition

    INNER JOIN (

    SELECT SpecimenID, Lock_Nr, Key_Nr, IQNr, Specimen_Nr, FailureMode, CityProduce, Slide_10Kind, Par1, Par2, Par3, Allow, RecordStamp

    FROM (

    SELECT --TOP 100 PERCENT

    SpecimenID, BlockNr AS Lock_Nr, KeyNr AS Key_Nr, SpecimenNr AS Specimen_Nr, IQNr, Failure AS FailureMode, RecordStamp,

    Allow, CityProduce, Value1 AS Par1, Value2 AS Par2, Value3 AS Par3, IQType AS Slide_10Kind

    FROM dbo.tbl_SpecimenInfoMajorPlace

    WHERE (CityProduce = N'Lathes') AND (Allow = - 1) AND (RecordStamp > GETDATE() - 730)

    --ORDER BY SpecimenID DESC

    ) d1

    UNION -- <<<<< is UNION ALL sufficient? Deduping may not be necessary.

    SELECT SpecimenID, Lock_Nr, Key_Nr, IQNr, Specimen_Nr, FailureMode, CityProduce, Slide_10Kind, Par1, Par2, Par3, Allow, RecordStamp

    FROM (

    SELECT ID AS SpecimenID, Lock_Nr, IqtNumber AS IQNr, Specimen_Nr, Par1, Par2, Par3, FailureMode, DateStamp AS RecordStamp, Key_Nr, ParRange,

    Slide_10Kind, 'na' AS CityProduce, - 1 AS Allow

    FROM dbo.Metrics_Test_Specimen_Data

    WHERE (DateStamp > GETDATE() - 1000)

    ) d2

    ) da

    ON si.KeyNr = da.Key_Nr

    AND si.SpecimenNr = da.Specimen_Nr

    WHERE (da.Allow = - 1)

    AND (si.City = N'Conversions')

    AND (si.Town = N'Perforator')

    AND (si.Failure IS NULL)

    AND (si.RecordStamp > GETDATE() - 365)

    --ORDER BY si.BlockNr, si.SpecimenNr

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • The estimated costs are just that, estimates. They're calculations based on the statistics available and the query being run. While they are the only number available to us when working with execution plans, they can't be taken too seriously and sometimes, they have to be ignored entirely.

    Just because the hash match is the most costly operation doesn't mean it should be eliminated. It's entirely possible, even likely, that based on the queries supplied, that's the best way to get the data that you need.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Chris,

    I shall do it that way. I actually found that one of the join was redundant and removed it and found performance improvement.

    Grant,

    Thanks for your reply. Shall not worry abt Hash joins anymore

  • I have a weired issue with my Record stamp filter in Index seek operation. The argument shows the following

    |--Index Seek(OBJECT:([HDIS].[dbo].[Metrics_Test_Specimen_Data].[idx_datecoveringcol]), SEEK:([Metrics_Test_Specimen_Data].[DateStamp] > getdate()-'Sep 28 1902 12:00AM') ORDERED FORWARD)

    I dont know why this weird date 1902 is apppering in my execution plan. I checked all my views & date limits. The oldest date that I have in any of my tables is from 2002.

    Please take a look at the highlighted records 6, 29,34,55 in the attached spreadsheet. You will see the date.

    Is it some kind of range or something? Do I need to worry about this?

    -Jaya

  • lallu_jaya (9/28/2010)


    Chris,

    I shall do it that way. I actually found that one of the join was redundant and removed it and found performance improvement.

    Grant,

    Thanks for your reply. Shall not worry abt Hash joins anymore

    Wait one... Don't get me wrong, hash joins are not automatically bad, but they're not automatically good either. You have to evaluate the query being run, the data being retrieved, and where the hash is within the execution plan to understand if it's a good thing or a bad thing.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply