Complex Join

  • I have to tables T1 and T2 that are related by two columns C1 and C2.  Both columns are null-able so it's not as simple as:

    T1 join T2 
    on T1.C1 = T2.C1
    and T1.C2 = T2.C2

    So I've come up with this:

    T1 join T2 
    on (
    T2.C1 = T1.C1 and T1.C2 is null
    )
    or (
    T2.C2 = T1.C2 and T1.C1 is null
    )
    or (
    T2.C1 = T1.C1 and T2.C2 = T1.C2
    )
    )

    I gotta believe there's an easier way, but it escapes me.

    thanx in advance, d

    • This topic was modified 3 years, 7 months ago by  DonlSimpson.

    Don Simpson



    I'm not sure about Heisenberg.

  • I would think you'd want this:

    T1 join T2

    on ((T1.C1 = T2.C1) OR (T1.C1 IS NULL AND T2.C1 IS NULL))

    and ((T1.C2 = T2.C2) OR (T1.C2 IS NULL AND T2.C2 IS NULL))

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Unfortunately, either column can be null in one table and not null in the other.  When I said "related by," I didn't mean to imply that there's an FK relationship.  It's related data used by a variety of reports.

    Don Simpson



    I'm not sure about Heisenberg.

  • So you want a NULL in the column to match *every* column on the other table that isn't NULL?  I.e., NULL is like a wildcard match?

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Yes, that would be true.

    Don Simpson



    I'm not sure about Heisenberg.

  • Then your original JOIN is fine and likely the cleanest way to do the join.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • A bit shorter version:

    T1 join T2 
    on (
    T2.C1 = T1.C1 OR T1.C1 is null
    )
    AND (
    T2.C2 = T1.C2 OR T1.C2 is null
    )

    When you express it this way it becomes obvious that records in T1 with NULL in both C1 and C2 will be joined to every record in T2. Is it what is expected?

    _____________
    Code for TallyGenerator

  • It's not what is expected, and the data condition won't occur.  Each column is null-able, but both cannot be null.

    Don Simpson



    I'm not sure about Heisenberg.

  • DonlSimpson wrote:

    It's not what is expected, and the data condition won't occur.  Each column is null-able, but both cannot be null.

    You should post some readily consumable test data and the desired results and let's find out if that's true or not. 😉  Please see the article at the first link in my signature line below.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • /*
    drop table leftTable;
    drop table rightTable;
    */
    create table leftTable (C1 int, C2 int, C3 varchar(20));
    create table rightTable (C1 int, C2 int, C3 varchar(20));

    insert into leftTable values (1, 1, 'leftTable 1');
    insert into leftTable values (2, 2, 'leftTable 2');
    insert into leftTable values (2, 9, 'leftTable 3');
    insert into leftTable values (3, 1, 'leftTable 4');
    insert into leftTable values (4, null, 'leftTable 5');

    insert into rightTable values (1, null, 'rightTable 1'); -- match (leftTable.C1 = rightTable.C1 and rightTable.C2 is null)
    insert into rightTable values (null, 1, 'rightTable 2'); -- match x2 (leftTable.C2 = rightTable.C2 and rightTable.C1 is null)
    insert into rightTable values (1, 1, 'rightTable 3'); -- match (leftTable.C1 = rightTable.C1 and leftTable.C2 = rightTable.C2)
    insert into rightTable values (null, null, 'rightTable 4'); -- no match

    insert into rightTable values (1, 2, 'rightTable 5'); -- no match
    insert into rightTable values (2, 1, 'rightTable 6'); -- no match
    insert into rightTable values (2, null, 'rightTable 7'); -- match x2
    insert into rightTable values (4, null, 'rightTable 8'); -- match
    insert into rightTable values (4, 1, 'rightTable 9'); -- no match
    insert into rightTable values (null, 4, 'rightTable A'); -- no match

    commit;

    select *
    from leftTable
    join rightTable
    on (
    leftTable.C1 = rightTable.C1 and rightTable.C2 is null
    )
    or (
    leftTable.C2 = rightTable.C2 and rightTable.C1 is null
    )
    or (
    leftTable.C1 = rightTable.C1 and leftTable.C2 = rightTable.C2
    )
    order by leftTable.C1, leftTable.C2, rightTable.C1, rightTable.C2;

    Don Simpson



    I'm not sure about Heisenberg.

  • Thanks for taking the time to post the readily consumable data.

    I agree... Sergiy's shortened version doesn't work the same as the code you posted.  I've not worked with Boolean Algebra since 1979 and so any other solution is eluding me.  Or, maybe it's not and there is no other solution.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Still there is a shorter version:

    select * 
    from leftTable T1
    join rightTable T2
    on (
    T1.C1 = T2.C1 AND (ISNULL(T1.C2, T2.C2) = ISNULL(T2.C2, T1.C2))
    )
    OR (
    T2.C2 = T1.C2 AND (ISNULL(T1.C1, T2.C1) = ISNULL(T2.C1, T1.C1))
    )

    Whoo-hoo!

    🙂

    _____________
    Code for TallyGenerator

  • At least I know it's not something trivial that I overlooked.  🙂

    • This reply was modified 3 years, 7 months ago by  DonlSimpson.

    Don Simpson



    I'm not sure about Heisenberg.

  • You never want to use ISNULL() in a WHERE or JOIN clause.  The code may be slightly longer the other way, but it could potentially run much faster.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Sergiy wrote:

    Still there is a shorter version:

    select * 
    from leftTable T1
    join rightTable T2
    on (
    T1.C1 = T2.C1 AND (ISNULL(T1.C2, T2.C2) = ISNULL(T2.C2, T1.C2))
    )
    OR (
    T2.C2 = T1.C2 AND (ISNULL(T1.C1, T2.C1) = ISNULL(T2.C1, T1.C1))
    )

    Whoo-hoo!

    🙂

    I came up with something similar prior to my previous post and rejected it as a solution on my part because of the ISNULL's.  It would be interesting to see if the original code or this codes fairs better on a large, properly indexed pair of tables.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply