query plan vs JOIN

  • anyone may know how query plan generated the query sequence when use INNER JOIN....which table will be picked as outer table...any input will be very appreciated.

    I have two tables as below

    CREATE TABLE t1 (id INT PRIMARY KEY, COL1 VARCHAR(100))

    CREATE TABLE T2 (id INT PRIMARY KEY, COL1 VARCHAR(100))

    GO

    INSERT INTO t1 VALUES(100, 'TEST1')

    INSERT INTO T2 VALUES (100, 'TEST2')

    GO

    SELECT*FROM T2 INNER JOIN T1 ON T2.ID =T1.ID WHERE T1.ID =100

    |--Nested Loops(Inner Join)

    |--Clustered Index Seek(OBJECT:([BESMgmt].[dbo].[t1].[PK__t1__6B8FCA2D]), SEEK:([BESMgmt].[dbo].[t1].[id]=(100)) ORDERED FORWARD)

    |--Clustered Index Seek(OBJECT:([BESMgmt].[dbo].[T2].[PK__T2__6D78129F]), SEEK:([BESMgmt].[dbo].[T2].[id]=(100)) ORDERED FORWARD)

    The query plan will use T1 as outer table.

    SELECT*FROM T1 INNER JOIN T2 ON T1.ID =T2.ID WHERE T1.ID =100

    |--Nested Loops(Inner Join)

    |--Clustered Index Seek(OBJECT:([BESMgmt].[dbo].[T2].[PK__T2__6D78129F]), SEEK:([BESMgmt].[dbo].[T2].[id]=(100)) ORDERED FORWARD)

    |--Clustered Index Seek(OBJECT:([BESMgmt].[dbo].[t1].[PK__t1__6B8FCA2D]), SEEK:([BESMgmt].[dbo].[t1].[id]=(100)) ORDERED FORWARD)

    The query plan will use T2 as outer table.

    why this happens? any suggestion?

    what i am guessing, SQL Server is picking up whatever it wants...looks like when there is not big different in the data volumn and distribution, and JOIN condition is JOIN clustered index from each table, and condition is filtering clustered index...SQL Server is choosing the last table in the JOIN... i am not sure if I can make this assumption. In a high concurrent environment, which one has been chosen as outer table is critical for resolving blocking and deadlock... if we can get confirmation for this assumption, instead of using force order hint... we can easily switch the join sequence to make T1 as last table in the JOIN, if I always want T1 as outer table......any idea?

  • why this happens? any suggestion?

    what i am guessing, SQL Server is picking up whatever it wants...looks like when there is not big different in the data volumn and distribution, and JOIN condition is JOIN clustered index from each table, and condition is filtering clustered index...SQL Server is choosing the last table in the JOIN... i am not sure if I can make this assumption. In a high concurrent environment, which one has been chosen as outer table is critical for resolving blocking and deadlock... if we can get confirmation for this assumption, instead of using force order hint... we can easily switch the join sequence to make T1 as last table in the JOIN, if I always want T1 as outer table......any idea?

    This is a new thing Management Studio does because INNER JOIN comes with performance improvements and other things that are not in OUTER JOIN because of the default mathematical NULL in OUTER JOIN. You could still write your own code, you should know it is not a random behavior it may be instructions in Management Studio to create optimized queries and INNER JOIN can be optimized many ways while some OUTER JOIN cannot be optimized.

    Kind regards,
    Gift Peddie

  • I don't know this for sure, and I think someone like Grant Fritchey might be better equipped to answer this question, but I'll give it a shot.

    In the example you are giving it really doesn't matter because you have 1 row in each table so choosing either one will work fine. But, with larger sets of data, I believe SQL Server is going to choose the table that returns the fewest rows based on the criteria.

  • First thing to note is that 1 row in each table is not enough for any meaningful conclusions. You really need a couple hundred at least.

    As far as I'm aware, with loop joins, the ideal is that the outer table has a small number of rows (because the loop runs once for each row in the outer table) and the inner table seekable on the join predicate. This is, of course, the ideal situation.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Yep. Gail nailed it (shocking I'm sure).

    With absolutely nothing to differentiate them, the engine is going to pick one... it has to. But you never know, there might be a slight difference in the distribution of the indexes or something that the engine used to decide that, all other things being equal, one of the tables belonged on the outer side of the join. It doesn't really matter with such a small data set.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Grant Fritchey (11/23/2009)


    Yep. Gail nailed it (shocking I'm sure).

    Just finished writing a blog post on 'optimal' join types. Virtually a copy-paste from blog to forum.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (11/23/2009)


    Grant Fritchey (11/23/2009)


    Yep. Gail nailed it (shocking I'm sure).

    Just finished writing a blog post on 'optimal' join types. Virtually a copy-paste from blog to forum.

    Excellent. I'm still catching up on my blog post readings. I'll get around to it.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply