Selecting the latest dated record containing a value

  • I have a set of medical testing results for numerous patients on different dates some with results and some without. The data comes from multiple sources and sometimes they may report on the same instance with or without results. The data is structured similar to the following example:

    [font="Courier New"]PatientID TestDate Result Source

    123456 03/31/2010 89 A

    123456 01/15/2010 98 B

    123456 10/27/2009 NULL A

    123456 10/27/2009 101 B

    789012 03/05/2010 NULL A

    789012 01/21/2010 121 B

    789012 12/15/2009 111 A

    345678 02/28/2010 NULL A

    345678 11/21/2009 NULL A

    345678 11/21/2009 95 B

    901234 02/14/2010 NULL B

    901234 09/14/2009 NULL B[/font]

    For each patient, I need to be able to identify the appropriate record to use for reporting given the following constraints:

    1. Find the latest record with a result and use the result from that record

    2. If there are no records with a result, then use the latest record anyway

    If I were doing this for a single patient, I would use something like the following:

    SELECT TOP 1 PatientID, TestDate, Result, Source

    FROM myTable

    WHERE PatientID = 789012

    ORDER BY CASE WHEN Result IS NULL THEN 0 ELSE 1 END DESC, TestDate

    My problem is, how do I accomplish this for all patients without resorting to REBAR? If I attempt to use aggregate functions such as MAX and MIN, it doesn’t allow me to keep the identity of the record for it could return the MAX date and MIN result from different records for a patient.

  • here you go Aaron;

    teh trick you want is to use hte ROW_NUMBER() function; that can give you "partitioned" results per PatientID, for example.

    if you can in the future, if you provide data in a format we can paste into SSMS, we can help you even faster:

    --results:

    RW PatientID TestDate Result Source

    1 123456 10/27/2009 101 B

    1 345678 11/21/2009 95 B

    1 789012 12/15/2009 111 A

    With MySampleData As (

    SELECT '123456' AS PatientID,

    '03/31/2010 89' AS TestDate,

    NULL AS Result,

    'A' AS Source UNION ALL

    SELECT '123456','01/15/2010','98',' B' UNION ALL

    SELECT '123456','10/27/2009',NULL,'A' UNION ALL

    SELECT '123456','10/27/2009','101','B' UNION ALL

    SELECT '789012','03/05/2010',NULL,'A' UNION ALL

    SELECT '789012','01/21/2010','121','B' UNION ALL

    SELECT '789012','12/15/2009','111','A' UNION ALL

    SELECT '345678','02/28/2010',NULL,'A' UNION ALL

    SELECT '345678','11/21/2009',NULL,'A' UNION ALL

    SELECT '345678','11/21/2009','95',' B' UNION ALL

    SELECT '901234','02/14/2010',NULL,'B' UNION ALL

    SELECT '901234','09/14/2009',NULL,'B')

    --we want the latest record with RESULTS, and just one record per patient,

    SELECT * FROM (

    SELECT row_number() OVER (PARTITION BY PatientID ORDER BY PatientID,TestDate DESC) AS RW,*

    FROM (SELECT * FROM MySampleData WHERE RESULT IS NOT NULL) X )MyAlias

    WHERE RW = 1

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • Thanks for your help, Lowell. I was concerned about putting in too much information that was not relevant to the issue, so I attempted to narrow it down a bit. However, I'll keep that in mind for the future.

    To be honest, I don't know why I didn't think of that approach myself! It sure seems straightforward enough. When I get it implemented, I'll let you know how it turns out.

  • As per your suggestion, Lowell, I'm posting the final solution in a manner that can be cut and pasted into SSMS. While your solution solved the majority of the problem, it didn't address everything but it sure gave me a huge jump start. Thank you!!!

    Here's the solution with the sample data:

    IF object_id(N'TempDB..#MySampleData') IS NOT NULL DROP TABLE #MySampleData;

    CREATE TABLE #MySampleData (

    PatientID int,

    TestDate datetime,

    Result int,

    Source char(1)

    );

    INSERT INTO #MySampleData VALUES ('123456','03/31/2010',89,'A');

    INSERT INTO #MySampleData VALUES ('123456','01/15/2010',98,'B');

    INSERT INTO #MySampleData VALUES ('123456','10/27/2009',NULL,'A');

    INSERT INTO #MySampleData VALUES ('123456','10/27/2009',101,'B');

    INSERT INTO #MySampleData VALUES ('789012','03/05/2010',NULL,'A');

    INSERT INTO #MySampleData VALUES ('789012','01/21/2010',121,'B');

    INSERT INTO #MySampleData VALUES ('789012','12/15/2009',111,'A');

    INSERT INTO #MySampleData VALUES ('345678','02/28/2010',NULL,'A');

    INSERT INTO #MySampleData VALUES ('345678','11/21/2009',NULL,'A');

    INSERT INTO #MySampleData VALUES ('345678','11/21/2009',95,'B');

    INSERT INTO #MySampleData VALUES ('901234','02/14/2010',NULL,'B');

    INSERT INTO #MySampleData VALUES ('901234','09/14/2009',NULL,'B');

    --we want the latest record with results (or just the latest if no results),

    --and just one record per patient

    WITH cteRows AS (

    SELECT row_number() OVER (PARTITION BY PatientID

    ORDER BY CASE WHEN Result IS NULL THEN 0 ELSE 1 END DESC,

    TestDate DESC) AS RW,*

    FROM #MySampleData

    )

    SELECT * FROM cteRows WHERE RW = 1

    For this example I had to load the data into a temp table so the dates would behave properly.

    Again, thanks a million!!

  • awesome Aaron;

    sometimes just a shove in the right direction gets you rolling;

    i had a brain block exactly like that last week, where i knew how to do something, but it wasn't obvious to me at the time to use the right technique;

    glad you got it the way you wanted!

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply