Blanking Out On How Joins Work

  • I'm better than this! I KNOW I'm better than this. Just the same I am hitting a brick wall, and I could use an extra set of eyes to point out where I'm getting this wrong.

    My project is to re-build a report that is currently written in MDX against a data cube, and to have the new one point to the database via normal SQL. None of the original documentation exists, so I am just building it from the ground up.

    I have started adding in tables to make sure I am getting the same universe that the live system (the MDX data cube version) shows I should get. I added 2 tables, and I was still getting everything. I add a 3rd table with a Left Join, and if I put a limit on that I end up short 854 records. When there is no restriction on the Left Join then I get everything I'm supposed to get.

    Have I just misunderstood how a Left Join works for the better part of 10 years? It shouldn't matter that the records do not show up on Table3 from the limit, because it is a left join. What gives?

    Here are two mildly edited (for public consumption) version of my 2 SQL scripts:

    Select TCH.Cas_ID

    ,TCCL.Cl_ID

    From dbo.TCas_History TCH

    Inner Join dbo.TCas_Client TCCL

    On TCH.Cas_ID = TCCL.Cas_ID

    Left Join dbo.TCas_Cntac TCC

    On TCC.Cas_ID = TCCL.Cas_ID

    Where TCH.Close_Dt Is Null

    And TCCL.Child_SW = 'Y';

    The above script does exactly what I expect it to.

    Select TCH.Cas_ID

    ,TCCL.Cl_ID

    From dbo.TCas_History TCH

    Inner Join dbo.TCas_Client TCCL

    On TCH.Cas_ID = TCCL.Cas_ID

    Left Join dbo.TCas_Cntac TCC

    On TCC.Cas_ID = TCCL.Cas_ID

    Where TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)

    And TCH.Close_Dt Is Null

    And TCCL.Child_SW = 'Y';

    Adding in the line testing TCC.Cntac_Dt causes me to lose 854 records.

    What am I missing in my logic?:angry:

    [font="Arial"]“Any fool can know. The point is to understand.”
    - Albert Einstein

    "DOH!"
    - Homer Simpson[/font]

  • jarid.lawson (11/10/2015)


    I'm better than this! I KNOW I'm better than this. Just the same I am hitting a brick wall, and I could use an extra set of eyes to point out where I'm getting this wrong.

    My project is to re-build a report that is currently written in MDX against a data cube, and to have the new one point to the database via normal SQL. None of the original documentation exists, so I am just building it from the ground up.

    I have started adding in tables to make sure I am getting the same universe that the live system (the MDX data cube version) shows I should get. I added 2 tables, and I was still getting everything. I add a 3rd table with a Left Join, and if I put a limit on that I end up short 854 records. When there is no restriction on the Left Join then I get everything I'm supposed to get.

    Have I just misunderstood how a Left Join works for the better part of 10 years? It shouldn't matter that the records do not show up on Table3 from the limit, because it is a left join. What gives?

    Here are two mildly edited (for public consumption) version of my 2 SQL scripts:

    Select TCH.Cas_ID

    ,TCCL.Cl_ID

    From dbo.TCas_History TCH

    Inner Join dbo.TCas_Client TCCL

    On TCH.Cas_ID = TCCL.Cas_ID

    Left Join dbo.TCas_Cntac TCC

    On TCC.Cas_ID = TCCL.Cas_ID

    Where TCH.Close_Dt Is Null

    And TCCL.Child_SW = 'Y';

    The above script does exactly what I expect it to.

    Select TCH.Cas_ID

    ,TCCL.Cl_ID

    From dbo.TCas_History TCH

    Inner Join dbo.TCas_Client TCCL

    On TCH.Cas_ID = TCCL.Cas_ID

    Left Join dbo.TCas_Cntac TCC

    On TCC.Cas_ID = TCCL.Cas_ID

    Where TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)

    And TCH.Close_Dt Is Null

    And TCCL.Child_SW = 'Y';

    Adding in the line testing TCC.Cntac_Dt causes me to lose 854 records.

    What am I missing in my logic?:angry:

    What happens if you change:

    Where TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)

    to:

    Where ((TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)( OR (TCC.Cntac_Dt IS NULL))



    Alvin Ramard
    Memphis PASS Chapter[/url]

    All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

    For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

  • The only thing I notice different between the two scripts is the check for date/time value in the second script. So that is where I would focus my troubleshooting.

    Running this:

    SELECT (CAST(CONVERT(VARCHAR, GETDATE() -120, 101) AS DATETIME)

    Returns:

    2015-07-13 00:00:00.00

    Is that what you are 'expecting' the comparison to be? I would hardcode that value into the query or a separate select to see what is returned...it's possible the code is returning the correct amount of data based on the date you are providing.

    -SQLBill

  • Alvin Ramard (11/10/2015)


    What happens if you change:

    Where TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)

    to:

    Where ((TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)( OR (TCC.Cntac_Dt IS NULL))

    That leaves me at 423 still missing.

    Am I wrong in thinking that it shouldn't matter though? A Left Join is supposed to pull everything on the left now matter what. I'm not missing anything when I have just the first 2 tables, or when I have all 3 tables with no limit on TCC. Because of that it should still return the full universe.

    [font="Arial"]“Any fool can know. The point is to understand.”
    - Albert Einstein

    "DOH!"
    - Homer Simpson[/font]

  • The [font="Courier New"]Where TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)[/font] clause is changing the LEFT JOIN into an INNER JOIN.

    I usually code it like this:

    Select TCH.Cas_ID

    ,TCCL.Cl_ID

    From dbo.TCas_History TCH

    Inner Join dbo.TCas_Client TCCL

    On TCH.Cas_ID = TCCL.Cas_ID

    Left Join dbo.TCas_Cntac TCC

    On TCC.Cas_ID = TCCL.Cas_ID

    And TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)

    Where TCH.Close_Dt Is Null

    And TCCL.Child_SW = 'Y';

    This is an article related to this problem: http://www.sqlservercentral.com/articles/T-SQL/93039/

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • SQLBill (11/10/2015)


    The only thing I notice different between the two scripts is the check for date/time value in the second script. So that is where I would focus my troubleshooting.

    Running this:

    SELECT (CAST(CONVERT(VARCHAR, GETDATE() -120, 101) AS DATETIME)

    Returns:

    2015-07-13 00:00:00.00

    Is that what you are 'expecting' the comparison to be? I would hardcode that value into the query or a separate select to see what is returned...it's possible the code is returning the correct amount of data based on the date you are providing.

    -SQLBill

    I tried both your way and checking for NULL values in TCC.Cntac_Dt. They returned the same values.

    [font="Arial"]“Any fool can know. The point is to understand.”
    - Albert Einstein

    "DOH!"
    - Homer Simpson[/font]

  • Your joins pull 'everything' based on the joins, but then the where clause 'kicks in' and eliminates some of the date because it doesn't meet the where condition.

    Basically, combine everything from these tables based on how I want them joined...oh, and once you've done that, get rid of any data that doesn't match the where condition.

    -SQLBill

  • Run this:

    Select TCH.Cas_ID

    ,TCCL.Cl_ID, TCC.Cntac_DT

    INTO #tmpTable

    From dbo.TCas_History TCH

    Inner Join dbo.TCas_Client TCCL

    On TCH.Cas_ID = TCCL.Cas_ID

    Left Join dbo.TCas_Cntac TCC

    On TCC.Cas_ID = TCCL.Cas_ID

    Where TCH.Close_Dt Is Null

    And TCCL.Child_SW = 'Y';

    That will give you all the data you are expecting.

    Then run these:

    SELECT COUNT(*)

    FROM #tmpTable

    SELECT COUNT(*)

    FROM #tmpTable

    WHERE TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)

    AND TCH.Close_Dt Is Null

    And TCCL.Child_SW = 'Y';

    The first will give you the number you are expecting, the second will give you the number based on the where condition from your second query. It all comes down to your query eliminating records that are older than Cast(Convert(Varchar,GetDate()-120,101) As DateTime). The first query of yours includes those records.

    -SQLBill

  • Ding, ding, ding, ding, ding...We Have A Winner!

    The Where TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime) clause is changing the LEFT JOIN into an INNER JOIN.

    I usually code it like this:

    Select TCH.Cas_ID

    ,TCCL.Cl_ID

    From dbo.TCas_History TCH

    Inner Join dbo.TCas_Client TCCL

    On TCH.Cas_ID = TCCL.Cas_ID

    Left Join dbo.TCas_Cntac TCC

    On TCC.Cas_ID = TCCL.Cas_ID

    And TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)

    Where TCH.Close_Dt Is Null

    And TCCL.Child_SW = 'Y';

    This is an article related to this problem: http://www.sqlservercentral.com/articles/T-SQL/93039/

    Luis C.

    General Disclaimer:

    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    Forum Etiquette: How to post data/code on a forum to get the best help

    Edited: Today @ 8:14:02 PM by Luis Cazares

    I've never had a left join react like that for me.

    If you're keeping score at home, here is the final query that returned what I was expecting:

    Select TCH.Cas_ID

    ,TCCL.Cl_ID

    Into #TempDataHolder2

    From dbo.TCas_History TCH

    Inner Join dbo.TCas_Client TCCL

    On TCH.Cas_ID = TCCL.Cas_ID

    Left Join dbo.TCas_Cntac TCC

    On TCC.Cas_ID = TCCL.Cas_ID

    And TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)

    Where TCH.Close_Dt Is Null

    And TCCL.Child_SW = 'Y';

    [font="Arial"]“Any fool can know. The point is to understand.”
    - Albert Einstein

    "DOH!"
    - Homer Simpson[/font]

  • An old article (one that I cut my teeth on :-D), but check out INF: How to Restrict the Inner Table of an ANSI Outer Join

  • As an aside what if you change

    TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)

    to

    TCC.Cntac_Dt >= Cast(DATEADD(DAY,-120, GetDate()) As Date)

  • jarid.lawson (11/10/2015)


    I've never had a left join react like that for me.

    Whenever using this syntax it will happen. You might have been very lucky for not encountering this problem before. You probably didn't have to filter on the outer table when using outer joins which is the reason that we find this problem.

    I'm sure that now you found it, you'll remember it.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • SoHelpMeCodd (11/11/2015)


    An old article (one that I cut my teeth on :-D), but check out INF: How to Restrict the Inner Table of an ANSI Outer Join

    Thank you. That was a great article.

    If you found this post in research for the same issue, please take a moment to read it. It is very bare-bones and direct to the issue at hand, and it explains it beautifully.

    [font="Arial"]“Any fool can know. The point is to understand.”
    - Albert Einstein

    "DOH!"
    - Homer Simpson[/font]

  • Luis Cazares (11/11/2015)


    jarid.lawson (11/10/2015)


    I've never had a left join react like that for me.

    Whenever using this syntax it will happen. You might have been very lucky for not encountering this problem before. You probably didn't have to filter on the outer table when using outer joins which is the reason that we find this problem.

    I'm sure that now you found it, you'll remember it.

    <Presses Like button>:-D

    [font="Arial"]“Any fool can know. The point is to understand.”
    - Albert Einstein

    "DOH!"
    - Homer Simpson[/font]

  • djj (11/11/2015)


    As an aside what if you change

    TCC.Cntac_Dt >= Cast(Convert(Varchar,GetDate()-120,101) As DateTime)

    to

    TCC.Cntac_Dt >= Cast(DATEADD(DAY,-120, GetDate()) As Date)

    Good call. I was just trying to take off the time, and my date calculation are still not as good as I wish they were. I believe that DateAdd is a function I'm going to be studying over the next little while.

    [font="Arial"]“Any fool can know. The point is to understand.”
    - Albert Einstein

    "DOH!"
    - Homer Simpson[/font]

Viewing 15 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply