Joining large tables to small tables.

  • Hi

    I hope this is an easy question to answer. 🙂

    I'm joining a large table to a small table and  results in an index scan on the large table, returning 1 million rows, which is then hash joined to the 2 rows returned by the small table

    select ST.col1, ST.col2, .ST.col3. LT.col4 
    from SmallTable as ST
    inner join LargeTable as LT ST.col1 = LT.col1
    where ST.col2 = 6

    Is there a better way to join these tables, so there isn't a table scan on the large table?
    Cheers
    Alex

  • alex.palmer - Tuesday, July 24, 2018 2:32 AM

    Hi

    I hope this is an easy question to answer. 🙂

    I'm joining a large table to a small table and  results in an index scan on the large table, returning 1 million rows, which is then hash joined to the 2 rows returned by the small table

    select ST.col1, ST.col2, .ST.col3. LT.col4 
    from SmallTable as ST
    inner join LargeTable as LT ST.col1 = LT.col1
    where ST.col2 = 6

    Is there a better way to join these tables, so there isn't a table scan on the large table?
    Cheers
    Alex

    Have you considered putting a supporting index on the large table? Say, the join column as key column, with any other columns referenced by the query in the INCLUDE section?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Yep! I've done exactly that and I still get a table scan

  • alex.palmer - Tuesday, July 24, 2018 3:35 AM

    Yep! I've done exactly that and I still get a table scan

    Can you attach the ACTUAL execution plan as a .sqlplan file attachment please?

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi

    sorry I've wasted your time.  I looked at the column definitions and the data types where different.

    Once I casted the nchar to a char and the data types matched I got the expected scan

    Cheers

    Alex

  • alex.palmer - Tuesday, July 24, 2018 4:31 AM

    Hi

    sorry I've wasted your time.  I looked at the column definitions and the data types where different.

    Once I casted the nchar to a char and the data types matched I got the expected scan

    Cheers

    Alex

    Wouldn't you expect a SEEK as opposed to a scan?

  • sgmunson - Thursday, July 26, 2018 11:26 AM

    alex.palmer - Tuesday, July 24, 2018 4:31 AM

    Hi

    sorry I've wasted your time.  I looked at the column definitions and the data types where different.

    Once I casted the nchar to a char and the data types matched I got the expected scan

    Cheers

    Alex

    Wouldn't you expect a SEEK as opposed to a scan?

    Depends. Might be a range scan using the index, which can be more efficient than a seek.

  • sorry I've wasted your time. I looked at the column definitions and the data types where different.

    Is there a reason you didn't use a foreign key reference?  This would have come up when you tried to establish it.  Also doing this, if it's appropriate, gives the analyzer additional information about the relationship of the data.

  • andycadley - Thursday, July 26, 2018 12:18 PM

    sgmunson - Thursday, July 26, 2018 11:26 AM

    alex.palmer - Tuesday, July 24, 2018 4:31 AM

    Hi

    sorry I've wasted your time.  I looked at the column definitions and the data types where different.

    Once I casted the nchar to a char and the data types matched I got the expected scan

    Cheers

    Alex

    Wouldn't you expect a SEEK as opposed to a scan?

    Depends. Might be a range scan using the index, which can be more efficient than a seek.

    IIRC, I'm pretty sure that range scans start with a seek.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • andycadley - Thursday, July 26, 2018 12:18 PM

    sgmunson - Thursday, July 26, 2018 11:26 AM

    alex.palmer - Tuesday, July 24, 2018 4:31 AM

    Hi

    sorry I've wasted your time.  I looked at the column definitions and the data types where different.

    Once I casted the nchar to a char and the data types matched I got the expected scan

    Cheers

    Alex

    Wouldn't you expect a SEEK as opposed to a scan?

    Depends. Might be a range scan using the index, which can be more efficient than a seek.

    That would show up as an index seek. Anything that searches down the index tree for a particular value is considered a seek.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Whenever you do a left join a small table and a large table make sure that large table is defined first in the join query. In your case the query can be optimised by simly doing:

     

    select ST.col1, ST.col2, .ST.col3. LT.col4

    from LargeTable as LT

    inner join SmallTable as ST

    where ST.col1 = LT.col1

    and ST.col2 =  6

  • satyanarayana09 wrote:

    Whenever you do a left join a small table and a large table make sure that large table is defined first in the join query. In your case the query can be optimised by simly doing:

    select ST.col1, ST.col2, .ST.col3. LT.col4 from LargeTable as LT inner join SmallTable as ST

    where ST.col1 = LT.col1 and ST.col2 =  6

    First, the question was about an INNER join, not a LEFT join.  The two types of joins are VERY different.

    Second, changing the order of tables in a LEFT join can completely change the data that is returned.  This is a much more important factor than the efficiency of the join.

    Third, I don't believe that the order of the tables in an INNER JOIN has any effect on the efficiency of that join.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen wrote:

    satyanarayana09 wrote:

    Whenever you do a left join a small table and a large table make sure that large table is defined first in the join query. In your case the query can be optimised by simly doing:

    select ST.col1, ST.col2, .ST.col3. LT.col4 from LargeTable as LT inner join SmallTable as ST

    where ST.col1 = LT.col1 and ST.col2 =  6

    First, the question was about an INNER join, not a LEFT join.  The two types of joins are VERY different.

    Second, changing the order of tables in a LEFT join can completely change the data that is returned.  This is a much more important factor than the efficiency of the join.

    Third, I don't believe that the order of the tables in an INNER JOIN has any effect on the efficiency of that join.

    Drew

    Yes, the optimiser decides on the order to join tables it has nothing to do with the order they are specified in the query unless you use a FORCE ORDER hint.

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply