Conditional Join between two tables

  • Hi SQL Folks,

    I have two tables tabA (cola1, cola2, cola3) and tabB(colb1, colb2, colb3, colb4) which I need to join on all 3 columns of table A.

    Of the 3 columns in tabA, few can be NULL, in that case I want to check the joining condition for the rest of the columns, so its conditional joining. Let me rephrase what I am trying to acheive - I need to check if the columns in joining condition is NULL in my 1st table (tabA), If so I need to check the joining condition for the rest of the two columns, if 2nd column is again NULL, I need to check the joining condition on the third column.

    What I am trying to do is as below. Its working, but is very slow when one of the tables is huge. Can I optimize it or rewrite in a better way ?

    --- First Create two tables

    Create table tabA

    (cola1 nvarchar(100), cola2 nvarchar(100), cola3 nvarchar(100))

    Insert into tabA values (NULL,'A1','A2')

    Select * from tabA

    create table tabB

    (colb1 nvarchar(100), colb2 nvarchar(100), colb3 nvarchar(100), colb4 nvarchar(100))

    Insert into tabB values ('B1','B2','B3','B4')

    Select * from tabB

    ------- Joining the two ---

    Select A.cola1,B.*

    from tabA as A INNER JOIN tabB as B

    ON

    (select case when (A.cola1) is null THEN 'a' else B.colb1 end)= (Select case When A.cola1 is null then 'a' else B.colb1 end )and

    (select case when (A.cola2) is null THEN 'a' else B.colb2 end)= (Select case When A.cola2 is null then 'a' else B.colb2 end )and

    (select case when (A.cola3) is null THEN 'a' else B.colb3 end)= (Select case When A.cola3 is null then 'a' else B.colb3 end )

  • By slow I mean its taking more than 10 minutes which is not acceptable as this would form the query for the dataset of a ssrs report.

  • Quick question, can you supply some more details, the data and query posted need a little bit more explanation as the result is simply a Cartesian produce of the two tables.

    😎

    BTW, have a look at Dwain Camps article How to Avoid Conditional JOINs in T-SQL[/url]?

  • The first table - tabA is being populated by Users and the columns are Nullable. So few records might have 1 NULL value, other records might have two NULL values etc.

    cola1, cola2, Cola3

    NULL, aa, abc

    acd, NULL, sr

    acc, NULL, NULL

    acb, cbf, cbf

    This table needs to be join with the other table tabB in Database which would have values for all these columns and I need to join the two with whatever available non-null columns I have in tabA. May be I am repeating what I have already mentioned in my first post on this thread but this is what I need to do.

    Select <col list> from tabA A INNER JOIN tabB B

    ON

    A.cola1=B.colb1 and -- this condition shouldn't participate if A.cola1 IS NULL

    A.cola2=B.colb2 and -- this condition shouldn't participate if A.cola2 IS NULL

    A.cola3=B.colb3 -- this condition shouldn't participate if A.cola3 IS NULL

    Thanks again.

  • Two possible solutions

    😎

    USE tempdb;

    GO

    SET NOCOUNT ON;

    --- First Create two tables

    IF OBJECT_ID(N'dbo.tabA') IS NOT NULL DROP TABLE dbo.tabA;

    CREATE TABLE dbo.tabA

    (

    cola1 NVARCHAR(100) NULL

    ,cola2 NVARCHAR(100) NULL

    ,cola3 NVARCHAR(100) NULL

    );

    INSERT INTO dbo.tabA (cola1,cola2,cola3)

    VALUES ( NULL, 'A1', 'A2')

    ,( 'A0', NULL, 'A2')

    ,( 'A0', 'A1', NULL)

    ,( 'A0', 'A1', 'A2')

    IF OBJECT_ID(N'dbo.tabB') IS NOT NULL DROP TABLE dbo.tabB;

    CREATE TABLE dbo.tabB

    (

    colb1 NVARCHAR(100) NOT NULL

    ,colb2 NVARCHAR(100) NOT NULL

    ,colb3 NVARCHAR(100) NOT NULL

    ,colb4 NVARCHAR(100) NOT NULL

    );

    INSERT INTO dbo.tabB (colb1,colb2,colb3,colb4)

    VALUES ( 'B1', 'B2', 'B3', 'B4')

    ,( 'A0', 'A1', 'A2', 'THIS IS IT')

    ,( 'C9', 'B2', 'B3', 'B4')

    -- IN THIS CASE THERE IS NO USE ADDING AN INDEX TO tabA, IT WILL NOT BE USED.

    CREATE NONCLUSTERED INDEX NCLIDX_DBO_TABB_COLB1_COLB2_COLB3_INCL_COLB4 ON dbo.tabB (colb1 ASC,colb2 ASC,colb3 ASC) INCLUDE (colb4);

    -- TWO METHODS

    SELECT

    TA.cola1

    ,TA.cola2

    ,TA.cola3

    ,TB.colb1

    ,TB.colb2

    ,TB.colb3

    ,TB.colb4

    FROM dbo.tabA TA

    INNER JOIN dbo.tabB TB

    ON TB.colb1 = ISNULL(TA.cola1,TB.colb1)

    AND TB.colb2 = ISNULL(TA.cola2,TB.colb2)

    AND TB.colb3 = ISNULL(TA.cola3,TB.colb3)

    ;

    ----------------

    SELECT

    TA.cola1

    ,TA.cola2

    ,TA.cola3

    ,TB.colb1

    ,TB.colb2

    ,TB.colb3

    ,TB.colb4

    FROM dbo.tabA TA

    CROSS APPLY dbo.tabB TB

    WHERE CASE WHEN TA.cola1 IS NULL THEN TB.colb1 ELSE TA.cola1 END = TB.colb1

    AND CASE WHEN TA.cola2 IS NULL THEN TB.colb2 ELSE TA.cola2 END = TB.colb2

    AND CASE WHEN TA.cola3 IS NULL THEN TB.colb3 ELSE TA.cola3 END = TB.colb3

    ;

    Results

    cola1 cola2 cola3 colb1 colb2 colb3 colb4

    ------ ------ ------ ------ ------ ------ -----------

    NULL A1 A2 A0 A1 A2 THIS IS IT

    A0 NULL A2 A0 A1 A2 THIS IS IT

    A0 A1 NULL A0 A1 A2 THIS IS IT

    A0 A1 A2 A0 A1 A2 THIS IS IT

  • A third possible solution:

    SELECT

    ta.cola1,

    ta.cola2,

    ta.cola3,

    colb1 = COALESCE(tb1.colb1, tb2.colb1, tb3.colb1),

    colb2 = COALESCE(tb1.colb2, tb2.colb2, tb3.colb2),

    colb3 = COALESCE(tb1.colb3, tb2.colb3, tb3.colb3),

    colb4 = COALESCE(tb1.colb4, tb2.colb4, tb3.colb4)

    FROM dbo.tabA ta

    LEFT JOIN dbo.tabB tb1 ON tb1.colb1 = ta.cola1

    LEFT JOIN dbo.tabB tb2 ON tb2.colb2 = ta.cola2

    LEFT JOIN dbo.tabB tb3 ON tb3.colb3 = ta.cola3

    Also, see this article by Dwain Camps[/url].

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Those JOIN conditions will always be met, regardless of the actual values in A (unless if a B column is NULL). So just do a CROSS JOIN of the tables:

    Select A.cola1,B.*

    from tabA as A CROSS JOIN tabB as B

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

  • Couldn't say a thanks yesterday as I was a bit occupied.

    Better late than never, so a big thanks for you.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply