Difference between 'hash match/inner join' and 'nested loops/inner join' in execution query plan?

Question

Post reply

Difference between 'hash match/inner join' and 'nested loops/inner join' in execution query plan?

Paresh Prajapati

SSCertifiable

Points: 7567
More actions
January 1, 2008 at 5:54 am

#119512

Difference between 'hash match/inner join' and 'nested loops/inner join' in execution query plan?
what is meaning of them?
when it is used in query plan?
_____________________________________________________________________________________________________________
Paresh Prajapati
➡ +919924626601
http://paresh-sqldba.blogspot.com/[/url]
LinkedIn | Tweet Me | FaceBook | Brijj

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 1

Forgive me for again directing you elsewhere, but I wrote on this just a couple days ago

http://sqlinthewild.co.za/index.php/2007/12/30/execution-plan-operations-joins/[/url]

Again, if you have more questions, I'll be very happy to answer them.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1003851 More actions · Answer 2

Nice blog, Gail...great explanations!

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 3

A nested loop query plan is usually optimal when there are small numbers of rows in one table (think 10s to perhaps 100s in most cases) that can be probed into another table that is either very small or has an index that allows a seek for each input. One thing to watch out for here is when the optimizer THINKS there will be few rows (check the estimated rows in the popup for the estimated query plan graphic) but in reality there are LOTS of rows (thousands or even millions). This is one of the worst things that can happen to a query plan and is usually the result of either out-of-date statistics, a cached query plan or data that is unevenly distributed. Each of these can be addressed to minimize the likelyhood of the problem.

A hash-match plan is optimal when there is a relatively few rows joined into a relatively large number of rows. Gail's post explains the basics of the mechanism. It can get REALLY slow on machines with insufficient buffer RAM to contain the hash tables in memory because they will have to be laid down to disk at a HUGE relative cost.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 4

Good point. Thanks.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Tim Mitchell SSCoach Points: 15681 More actions · Answer 5

TheSQLGuru (1/2/2008)
A nested loop query plan is usually optimal when there are small numbers of rows in one table (think 10s to perhaps 100s in most cases) that can be probed into another table that is either very small or has an index that allows a seek for each input. One thing to watch out for here is when the optimizer THINKS there will be few rows (check the estimated rows in the popup for the estimated query plan graphic) but in reality there are LOTS of rows (thousands or even millions). This is one of the worst things that can happen to a query plan and is usually the result of either out-of-date statistics, a cached query plan or data that is unevenly distributed. Each of these can be addressed to minimize the likelyhood of the problem.
A hash-match plan is optimal when there is a relatively few rows joined into a relatively large number of rows. Gail's post explains the basics of the mechanism. It can get REALLY slow on machines with insufficient buffer RAM to contain the hash tables in memory because they will have to be laid down to disk at a HUGE relative cost.

I know this is an old thread, but it seems to address the problem I'm facing now. I'll preface this description by saying that I deal mostly in BI, so you won't hurt my feelings if you point out a fundamental error in my performance tuning logic 😉

I've got a relational query run from SSRS that seems to be generating an inefficient execution plan. As part of the query, I'm joining an 18m row table with a 2k row table, and the resulting join shows up as a Nested Loop in my execution plan. Statistics on both tables are up to date, I've dumped the plan cache, and the columns on which the tables are joined are both indexed and are of the same data type (VARCHAR(7)).

My research tells me that a nested loop is best for small data sets only, and that a hash join might be a better performing method. I think the nested loop might be chosen because the expected number of rows (1) is wildly different from the actual number of rows (2.5 million).

So the question is, is there anything else I can do to help the query optimizer choose a different join structure? Further, are there any suggestions to improve the affinity between the expected and actual number of rows?

By the way, I'm checking with my client to make sure they have no concerns about uploading their execution plan to this thread.

Thanks,

Tim

Tim Mitchell, Microsoft Data Platform MVP
Data Warehouse and ETL Consultant
TimMitchell.net | @Tim_Mitchell | Tyleris.com
ETL Best Practices

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 6

I suppose the important thing here is why the mismatch in row estimates.

Can you maybe start a new thread and post the query in question and the execution plan please?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

chndn.engg Valued Member Points: 72 More actions · Answer 7

I want to understand HASH MATCH. What is Procession of it.

Gail Shaw SSC Guru Points: 1004504 More actions · Answer 8

Did you read the blog post that was given in this thread? Did you read the post it links to?

If you did and still have questions, please start a new thread and ask specific questions.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass