Left join with "soft" condition

  • Hello everybody,

    I need to join two tables based on patient ID and dates.The first table has the patient ID and dates of some test.

    t1

    ID Dt1

    1 05-jan-2010

    1 14-mar-2011

    2 02-feb-2009

    2 17-jan-2010

    The second has dates of some other test with a few additional parameters.

    t2

    ID Dt2 Param1 Param2

    1 17-nov-2009 1 5

    1 12-feb-2010 1

    1 27-mar-2010 3 5

    1 03-jan-2011 3

    1 20-feb-2011 5 5

    1 15-apr-2011 2 1

    2 12-jan-2009

    2 27-feb-2009 7

    2 19-mar-2009 3

    2 25-dec-2009 3

    2 08-mar-2010 1 2

    The problem is that the dates may or may not coincide so I need to add the records from the second table based on the closest date:

    ID Dt1 Dt2 Param1 Param2

    1 05-jan-2010 12-feb-2010 1

    1 14-mar-2011 20-feb-2011 5 5

    2 02-feb-2009 27-feb-2009 7

    2 17-jan-2010 25-dec-2009 3

    I am hoping you experts can help me with a right SQL statement. I don't really need the second date to appear in the resultset, this is just to indicate which records should be picked.

    Thank you in advance

    Alex

  • Hi Alex. Welcome to the forums. It would be greatly beneficial if you could provide ddl and sample data in a consumable format. You can get some details about how to do that by following the first link in my signature. Additionally it would be greatly helpful if you could explain the business rules for this.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • Hi Alex,

    I was in a good mood and prepared the sample data, but I might not do it again, it's just to show you how it should be done.

    I'm including 2 possible solutions but none of them will perform great on a large number of data. You should test both because even if the CTE seems to work better with the sample data, it might change with the distribution of real data.

    Be sure to understand what's going on and post any questions you have.

    -- This is DDL and sample data as you should post it, it's not part of the solution

    CREATE TABLE #t1(

    ID int,

    Dt1 date)

    INSERT INTO #t1

    VALUES

    (1, '05-jan-2010'),

    (1, '14-mar-2011'),

    (2, '02-feb-2009'),

    (2, '17-jan-2010')

    CREATE TABLE #t2(

    ID int,

    Dt2 date,

    Param1 int,

    Param2 int)

    INSERT INTO #t2 VALUES(

    1, '17-nov-2009', 1, 5),(

    1, '12-feb-2010', 1, NULL),(

    1, '27-mar-2010', 3, 5),(

    1, '03-jan-2011', 3, NULL),(

    1, '20-feb-2011', 5, 5),(

    1, '15-apr-2011', 2, 1),(

    2, '12-jan-2009', NULL, NULL),(

    2, '27-feb-2009', 7, NULL),(

    2, '19-mar-2009', 3, NULL),(

    2, '25-dec-2009', 3, NULL),(

    2, '08-mar-2010', 1, 2);

    --DDL & Sample ends here and solution starts

    --Solution using CTE with ROW_NUMBER()

    WITH CTE AS(

    SELECT t1.ID,

    t1.Dt1,

    t2.Param1,

    t2.Param2,

    ROW_NUMBER() OVER( PARTITION BY t1.ID, t1.Dt1 ORDER BY ABS( DATEDIFF( DD, t1.Dt1, t2.Dt2))) rn

    FROM #t1 t1

    LEFT

    JOIN #t2 t2 ON t1.ID = t2.ID

    )

    SELECT ID,

    Dt1,

    Param1,

    Param2

    FROM CTE

    WHERE rn = 1

    --Solution using OUTER APPLY and TOP 1

    SELECT *

    FROM #t1 t1

    OUTER APPLY( SELECT TOP 1 Param1, Param2

    FROM #t2 t2

    WHERE t1.ID = t2.ID

    ORDER BY ABS( DATEDIFF( DD, t1.Dt1, t2.Dt2)))x

    --Clean Test data

    DROP TABLE #t1

    DROP TABLE #t2

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • +1 to Luis for your good/charitable mood and two excellent solutions.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • Hi Luis,

    Thanks a lot for the solutions! And I have learned my lesson from you and Sean how to properly ask questions )))

    Now my part will be to adopt the statements to the system I need the solution in (it's SAS). But this is a great start.

    Thanks again!

    Alex

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply