Trying to optimize performance of query having Hash Matches & Agreegate

Question

Post reply

Trying to optimize performance of query having Hash Matches & Agreegate

jaya

SSCrazy

Points: 2109
More actions
September 23, 2010 at 3:42 am

#145400

hi

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

Could you possibly post the plans as .sqlplan files? http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Are you sure you want a nested loop join? It is not always the best join type.

http://sqlinthewild.co.za/index.php/2009/11/24/the-most-optimal-join-type/

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

jaya SSCrazy Points: 2109 More actions · Answer 2

Hi Gail,

As you can see in the Execution plan a large amount about 46% is being consumed by Hash joins.

I am not sure if I want Nested loop joins or merge joins. I want to increase the performance of the query, I thought this is the place I need to modify.

I am using sql 2000. So I could not save plan in .sqlplan format. The only option for me is to copy and paste onto spreadsheet...Sorry..

Thank you

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 3

Can you please post the view definitions:

vMBlockstoFinal1, vMBlockstoFinal2, vMBlockstoFinal3, vMBlockstoFinal4

Thanks

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

jaya SSCrazy Points: 2109 More actions · Answer 4

jaya

SSCrazy

Points: 2109

September 24, 2010 at 6:42 am

#1226911

..

Grant Fritchey SSC Guru Points: 398679 More actions · Answer 5

First off, nesting views frequently leads to performance problems. I recommend against it.

Based on the info in the spread sheet, it looks like you have out of date statistics. You've got missing stats warnings.

I suspect the UNION operation in one of the views is leading to the DISTINCT operator, which is an aggregation.

A lot of this is speculation based on the information at hand. An actual execution plan instead of an estimated plan would be useful.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

Grant Fritchey (9/24/2010)
Based on the info in the spread sheet, it looks like you have out of date statistics. You've got missing stats warnings.

Or maybe auto_create_stats turned off?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 7

Grant Fritchey (9/24/2010)
First off, nesting views frequently leads to performance problems. I recommend against it.

Especially when different levels of nesting have different sort criteria.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Grant Fritchey SSC Guru Points: 398679 More actions · Answer 8

GilaMonster (9/24/2010)
Grant Fritchey (9/24/2010)
Based on the info in the spread sheet, it looks like you have out of date statistics. You've got missing stats warnings.
Or maybe auto_create_stats turned off?

Yep, absolute possibility.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

jaya SSCrazy Points: 2109 More actions · Answer 9

jaya

SSCrazy

Points: 2109

September 27, 2010 at 8:44 pm

#1228124

..

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 10

Jaya, I reckon the most cost-effective way to approach your problem at this point in time is to see how each of your queries performs when you sub in the view definitions and remove anything which isn't essential for the resultsets you need from this sproc. It shouldn't take you too long to do, I tried one of them last week and in five minutes it was almost there.

SELECT --TOP 100 PERCENT

si.BlockNr,

si.SpecimenNr,

si.KeyNr,

0 AS Par5Avg,

0 AS Par6Avg,

si.Failure AS FailureMode,

da.Par1,

da.Par2,

da.Par3,

'SUB_Perf' AS ViewOrigin

FROM dbo.vtbl_BlockNrSpecimenInformation si -- <<<<< is this a view too? If so, resolve it out to the view definition

INNER JOIN (

SELECT SpecimenID, Lock_Nr, Key_Nr, IQNr, Specimen_Nr, FailureMode, CityProduce, Slide_10Kind, Par1, Par2, Par3, Allow, RecordStamp

FROM (

SELECT --TOP 100 PERCENT

SpecimenID, BlockNr AS Lock_Nr, KeyNr AS Key_Nr, SpecimenNr AS Specimen_Nr, IQNr, Failure AS FailureMode, RecordStamp,

Allow, CityProduce, Value1 AS Par1, Value2 AS Par2, Value3 AS Par3, IQType AS Slide_10Kind

FROM dbo.tbl_SpecimenInfoMajorPlace

WHERE (CityProduce = N'Lathes') AND (Allow = - 1) AND (RecordStamp > GETDATE() - 730)

--ORDER BY SpecimenID DESC

) d1

UNION -- <<<<< is UNION ALL sufficient? Deduping may not be necessary.

SELECT SpecimenID, Lock_Nr, Key_Nr, IQNr, Specimen_Nr, FailureMode, CityProduce, Slide_10Kind, Par1, Par2, Par3, Allow, RecordStamp

FROM (

SELECT ID AS SpecimenID, Lock_Nr, IqtNumber AS IQNr, Specimen_Nr, Par1, Par2, Par3, FailureMode, DateStamp AS RecordStamp, Key_Nr, ParRange,

Slide_10Kind, 'na' AS CityProduce, - 1 AS Allow

FROM dbo.Metrics_Test_Specimen_Data

WHERE (DateStamp > GETDATE() - 1000)

) d2

) da

ON si.KeyNr = da.Key_Nr

AND si.SpecimenNr = da.Specimen_Nr

WHERE (da.Allow = - 1)

AND (si.City = N'Conversions')

AND (si.Town = N'Perforator')

AND (si.Failure IS NULL)

AND (si.RecordStamp > GETDATE() - 365)

--ORDER BY si.BlockNr, si.SpecimenNr

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Grant Fritchey SSC Guru Points: 398679 More actions · Answer 11

The estimated costs are just that, estimates. They're calculations based on the statistics available and the query being run. While they are the only number available to us when working with execution plans, they can't be taken too seriously and sometimes, they have to be ignored entirely.

Just because the hash match is the most costly operation doesn't mean it should be eliminated. It's entirely possible, even likely, that based on the queries supplied, that's the best way to get the data that you need.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

jaya SSCrazy Points: 2109 More actions · Answer 12

Chris,

I shall do it that way. I actually found that one of the join was redundant and removed it and found performance improvement.

Grant,

Thanks for your reply. Shall not worry abt Hash joins anymore

jaya SSCrazy Points: 2109 More actions · Answer 13

I have a weired issue with my Record stamp filter in Index seek operation. The argument shows the following

|--Index Seek(OBJECT:([HDIS].[dbo].[Metrics_Test_Specimen_Data].[idx_datecoveringcol]), SEEK:([Metrics_Test_Specimen_Data].[DateStamp] > getdate()-'Sep 28 1902 12:00AM') ORDERED FORWARD)

I dont know why this weird date 1902 is apppering in my execution plan. I checked all my views & date limits. The oldest date that I have in any of my tables is from 2002.

Please take a look at the highlighted records 6, 29,34,55 in the attached spreadsheet. You will see the date.

Is it some kind of range or something? Do I need to worry about this?

-Jaya

Grant Fritchey SSC Guru Points: 398679 More actions · Answer 14

lallu_jaya (9/28/2010)
Chris,
I shall do it that way. I actually found that one of the join was redundant and removed it and found performance improvement.
Grant,
Thanks for your reply. Shall not worry abt Hash joins anymore

Wait one... Don't get me wrong, hash joins are not automatically bad, but they're not automatically good either. You have to evaluate the query being run, the data being retrieved, and where the hash is within the execution plan to understand if it's a good thing or a bad thing.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning