Avoid LEFT join

  • npranj

    SSCrazy

    Points: 2323

    I am facing issues with a LEFT JOIN in my query. It takes 45 secs to process on the production server due to huge number of records. Need help in building a query to avoid the LEFT JOIN. I am Trying to use UNION ALL and it works much faster except that I am stuck in the last bit.

    scripts (sample):

    CREATE TABLE [dbo].[tbl_PersonDetails](

    [PersonID] [int] NOT NULL,

    [LeaveTimeId] [int] NOT NULL

    ) ON [PRIMARY]

    GO

    CREATE TABLE [dbo].[tbl_PersonLeaveDetails](

    [PersonId] [int] NOT NULL,

    [LeaveFromTimeID] [int] NULL,

    [LeaveToTimeID] [int] NULL

    ) ON [PRIMARY]

    GO

    INSERT [dbo].[tbl_PersonLeaveDetails] ([PersonId], [LeaveFromTimeID], [LeaveToTimeID]) VALUES (1, 5, 11)

    INSERT [dbo].[tbl_PersonLeaveDetails] ([PersonId], [LeaveFromTimeID], [LeaveToTimeID]) VALUES (2, 12, 15)

    INSERT [dbo].[tbl_PersonDetails] ([PersonID], [LeaveTimeId]) VALUES (1, 10)

    INSERT [dbo].[tbl_PersonDetails] ([PersonID], [LeaveTimeId]) VALUES (2, 8)

    INSERT [dbo].[tbl_PersonDetails] ([PersonID], [LeaveTimeId]) VALUES (3, 9)

    INSERT [dbo].[tbl_PersonDetails] ([PersonID], [LeaveTimeId]) VALUES (1, 4)

    Requirement:

    --------------------

    Need Rows from tbl_PersonDetails macthing (all 3 below) following criteria :

    1. tbl_PersonDetails.PersonID is present in tbl_PersonLeaveDetails

    2.tbl_PersonDetails.TimeID does not fall between any of the aligned (matching personid) FromTimeID and ToTimeID in tbl_PersonLeaveDetails.

    3. not using LEFT join

    so in this case for example.. need -

    tbl_PersonDetails table .... record, 1,4

  • kapil_kk

    SSC-Insane

    Points: 21316

    try this_

    select t1.PersonID, t1.LeaveTimeId

    FROM tbl_PersonDetails t1

    JOIN tbl_PersonLeaveDetails t2 ON t1.PersonID = t2.PersonId

    WHERE t1.LeaveTimeId NOT IN (select t1.LeaveTimeId FROM tbl_PersonLeaveDetails t2 WHERE t1.LeaveTimeId BETWEEN t2.LeaveFromTimeID AND t2.LeaveToTimeID)

    _______________________________________________________________
    To get quick answer follow this link:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • npranj

    SSCrazy

    Points: 2323

    Thanks Kapil. It wont work because we need a join with the ID. NEed to check TimeId against each Person ID.

    Got the query. IT was simple. -

    select PDD.*

    from dbo.tbl_PersonLeaveDetails LD

    inner join dbo.tbl_PersonDetails PDD

    on LD.personid = PDD.PersonID

    where PDD.LeaveTimeId not between ld.leavefromtimeid and ld.leavetotimeid

  • kapil_kk

    SSC-Insane

    Points: 21316

    npranj (8/20/2013)


    Thanks Kapil. It wont work because we need a join with the ID. NEed to check TimeId against each Person ID.

    Got the query. IT was simple. -

    select PDD.*

    from dbo.tbl_PersonLeaveDetails LD

    inner join dbo.tbl_PersonDetails PDD

    on LD.personid = PDD.PersonID

    where PDD.LeaveTimeId not between ld.leavefromtimeid and ld.leavetotimeid

    As per your requirement I put a join on personID only then for the matched personID I checked the LeaveTimeID from tbl_personleavedetails...

    After runnig my query I got the expected result as you mentioned:

    1,4 from tbl_persondetails..

    _______________________________________________________________
    To get quick answer follow this link:
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

  • Dwain Camps

    SSC Guru

    Points: 86893

    I can't say I really understand your requirements enough to give it a shot.

    For example, shouldn't this row be spit out of your expected results because 8 is not between 11 and 15?

    INSERT [dbo].[tbl_PersonDetails] ([PersonID], [LeaveTimeId]) VALUES (2, 8)

    For what it's worth:

    1. You probably need some indexing but difficult to suggest what until I understand your requirements.

    2. LEFT JOIN is probably not the problem.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • HanShi

    SSC-Dedicated

    Points: 33324

    According to your stated requirements, record (2, 8) from table [tbl_PersonDetails] should also be in your resultset. The value 8 doesn't fall between 12 and 15.

    This code uses an INNER JOIN and produces the results according to your requirements:

    select [tbl_PersonDetails].[PersonID], [LeaveTimeId], [LeaveFromTimeID], [LeaveToTimeID]

    from [tbl_PersonDetails]

    inner join [tbl_PersonLeaveDetails]

    on [tbl_PersonDetails].[PersonID] = [tbl_PersonLeaveDetails].[PersonID]

    where [LeaveTimeId] NOT BETWEEN [LeaveFromTimeID] AND [LeaveToTimeID]

    Edit: I see you allready came to the same query 😉

    ** Don't mistake the ‘stupidity of the crowd’ for the ‘wisdom of the group’! **
  • ChrisM@Work

    SSC Guru

    Points: 186094

    npranj (8/20/2013)


    I am facing issues with a LEFT JOIN in my query. It takes 45 secs to process on the production server due to huge number of records. Need help in building a query to avoid the LEFT JOIN. I am Trying to use UNION ALL and it works much faster except that I am stuck in the last bit.

    ...

    Can you post the estimated execution plan of the slow query as a .sqlplan attachment? It makes this kind of problem far easier to address.

    [font="Arial"]“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw[/font]


    For fast, accurate and documented assistance in answering your questions, please read this article[/url].
    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]
    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
    [url

  • npranj

    SSCrazy

    Points: 2323

    Hi Chris.. am afraid will not be able to put the execution plan of the query here due to confidentiality issues.

    However, the exec plan had all Index Seeks and one Index Scan. The Index Scan was on the table which I was using to JOIN (LEFT). So the main table was feteching 30k records out of over 20 million records and the table joined LEFT had only 10 matching rows.

    I derived another way of removing the LEFT join.

    I created a temp table to populate all records from the other tables (30k) and then updated the temp table for those 10 matching rows using inner join. It works now - thanks !

  • ChrisM@Work

    SSC Guru

    Points: 186094

    npranj (8/21/2013)


    Hi Chris.. am afraid will not be able to put the execution plan of the query here due to confidentiality issues.

    However, the exec plan had all Index Seeks and one Index Scan. The Index Scan was on the table which I was using to JOIN (LEFT). So the main table was feteching 30k records out of over 20 million records and the table joined LEFT had only 10 matching rows.

    I derived another way of removing the LEFT join.

    I created a temp table to populate all records from the other tables (30k) and then updated the temp table for those 10 matching rows using inner join. It works now - thanks !

    Understood - and thanks for putting up some descriptions in lieu.

    You're really jumping through hoops with your final suggestion above. Generally, we jump through hoops after trying more orthodox methods. In this case, the more orthodox method is to create an index (covering if necessary) on the left-joined table to support the join and if possible any other filters on that table.

    [font="Arial"]“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw[/font]


    For fast, accurate and documented assistance in answering your questions, please read this article[/url].
    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]
    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
    [url

  • npranj

    SSCrazy

    Points: 2323

    Yes thanks for highlighting that. I did check the indexes on the table. There is a covering index in place already. I was getting an index scan on that covering index itself. Hence was thinking in lines of avoiding left join and using one of the two options -

    1. Break the join and use UNION ALL. [Ruled it out coz there was business logic involved when i tried to break the query. I would have had to add multiple statements to cover all scenario].

    2. Insert into temp table and then update specific records. [This worked as it was slightly simpler will no scope of missing out records)].

    In the lower environment when I try running the updated query (insert into temp table and update specific records) - it takes ~6 secs whereas the original query with left join is taking ~ 1min 50 secs. Hence I closed in on this approach to re-write the query.

  • ChrisM@Work

    SSC Guru

    Points: 186094

    npranj (8/21/2013)


    Yes thanks for highlighting that. I did check the indexes on the table. There is a covering index in place already. I was getting an index scan on that covering index itself. Hence was thinking in lines of avoiding left join and using one of the two options -

    1. Break the join and use UNION ALL. [Ruled it out coz there was business logic involved when i tried to break the query. I would have had to add multiple statements to cover all scenario].

    2. Insert into temp table and then update specific records. [This worked as it was slightly simpler will no scope of missing out records)].

    In the lower environment when I try running the updated query (insert into temp table and update specific records) - it takes ~6 secs whereas the original query with left join is taking ~ 1min 50 secs. Hence I closed in on this approach to re-write the query.

    An index scan on the covering index suggests that it's being used by the query for this join because, with fewer columns than the table, it's cheaper than a clustered index scan (table scan). You may benefit from a new index or a change to this index. Check the seek and residual predicates of the index scan operator. Assuming the operator for the left join is a Nested Loops (Left Outer Join) operator, how many rows are on the top and bottom input?

    Just as a matter of interest, if you comment out the left join (and the columns in the SELECT list), how long does the query take?

    [font="Arial"]“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw[/font]


    For fast, accurate and documented assistance in answering your questions, please read this article[/url].
    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]
    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
    [url

  • npranj

    SSCrazy

    Points: 2323

    Top Input : 30,200 rows

    Bottom Input: 8,05,000 rows (left join input)

    There is a HASH MATCH (left outer)

    And when I comment out the LEFT JOIN (and its aligned columns), the original query takes 3 secs (instead of 1 min 57 secs) on lower environment.

  • HelloHai

    SSC Rookie

    Points: 33

    nil

  • ChrisM@Work

    SSC Guru

    Points: 186094

    mohanaprabhu.v (6/4/2015)


    SELECT tblInvoice.*, tblCustomer.FirstName + SPACE(1) +

    tblCustomer.LastName as CustomerName FROM tblInvoice WITH(NOLOCK)

    LEFT JOIN tblCustomer WITH(NOLOCK) ON tblInvoice.CustomerNo = tblCustomer.CustomerNo

    please help us to improve this query performance level without affecting the existing functionality

    Sure. Start a new thread, and include the Actual Execution Plan as an attachment.

    If you want all columns from your invoice table, your options for tuning this query will be limited.

    [font="Arial"]“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw[/font]


    For fast, accurate and documented assistance in answering your questions, please read this article[/url].
    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]
    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]
    [url

  • HelloHai

    SSC Rookie

    Points: 33

    Yes i want all columns from invoice table,

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply