How to avoid Nested Loop Join

  • I've a query like so:

    select distinct t1.f1 from t1 join t2 on t1.f2 = t2.f2

    join t3 on t1.f4 = t3.f4

    join t4 on t2.f5 = t4.f5

    where t3.dt is not null and t2.f9 = xxx

    it takes 6 min to execute. If i change select to do a select t1.* instead of select distinct t1.f1, it takes 3 seconds.

    When i look at the execution plans for both queries the difference is that in 1st query there is a nested loop join where there is a hash join in the 2nd.

    So if I change the query to:

    select distinct t1.f1 from t1 inner hash join t2 on t1.f2 = t2.f2

    join t3 on t1.f4 = t3.f4

    join t4 on t2.f5 = t4.f5

    where t3.dt is not null and t2.f9 = xxx

    The query takes 3 seconds.

    The problem is that this is dynamically generated SQL in an application and we don't want to use hints within it. I've tried updating stats on all tables in the joins, added covering indexes etc. nothing changed the plan. What can be done to force the plan to change from nested loop join to a hash join without using a hint explicitly?

    Thanks

  • Table definitions, index definitions and execution plan please.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I think the "Distinct" stinks..

    Try to rewriting the query using aggregate functions instead

    You need to look at the query plan..

    I have a personal pet hate using the Distinct keyword. Overly used in lazy programming..

  • If the requirement is to remove duplicate rows, use DISTINCT

    If the requirement is to produce aggregates with grouping, use GROUP BY

    Seeing Group By with no aggregates suggests that someone read the myth that group by is faster/better than DISTINCT and didn't do any tests.

    Specify the query in the simplest, most obvious way, if after testing performance is inadequate then consider alternate query forms and test them. If they're faster great, if not, go back to the simple form.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Do you need the aggregation of DISTINCT? Are there data problems you could fix instead of using DISTINCT? You might want to try attacking that. But, I'm with Gail, without seeing what's happening through the execution plan, there's not much to suggest here.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • I'm afraid I can't put table definitions and query plans here. I can send you via PM.

  • If you PM them to me I'll delete them (not a private consultant any longer). The reason we're asking for them to be posted here is so that anyone wandering by can help, not just the person you send the details to

    You can obfuscate both, replace the table and column names with generic names, run a search/replace over the plan to do the same (just make sure it still loads). We don't care about the names, we care about the operators, their properties and what that says about what SQL's doing

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply