Difficult recursive query problem

  • I'm trying to query our health care database to find the set of patients who meet certain diagnosis criteria.  The tricky thing is the set of exclusions.  I've summarized them in the example below (the real problem has long lists of criteria).  As you can see, my query selects the wrong set due to the ANDing of a pair of false as well as a pair of true.

    To summarize:  Hospital systems may assign the same ID number without knowing the same number is in use at another hospital.  So Patient Jones is identified not only by ID number but also by facility number, IE: Patient 1, Facility 1.  Also, Patient Jones may only have one diagnosis but Patient Smith may have been in a plane crash and have dozens of injuries.  So the diagnosis sequence is anything from 1 to as many problems as one can have and still make it to the hospital.

    Then as for the diagnosis in question, the criteria has a bunch of co-dependent diagnosis.  So if a patient has an injury during birth but was delivered on the way to the hospital, then that three diagnosis of 'delivery','injured during delivery', and 'delivered outside the hospital'.  Since the report in question wants to know about babies delivered with injuries that occurred IN the hospital, then that patient should NOT be counted.  And there are a LOT more variables but you get the point.

    Any help is greatly appreciated!  Also, creating intermediary views is not a problem so if you can think of one that will help, that's perfectly fine.

    Oh, and we use SQL Server 2000  so I can't use a handy common runtime expression to do anything tricky; it all has to be a TSQL.


    create table #diagnosis (

    hospital_id int,

    patient_id int,

    diagnosis_seq int,

    diagnosis_code char(2))

    --Rules : Select distinct patients (facility/patient ID combos) with

    --an E code as long as there is no V1 code with an R1 code

    --or a V2 code with an R2 code.

    --This patient should be counted:

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (1,1,1,'E1')

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (1,1,2,'E2')

    --This patient should be counted: (even though there's a V1 there isn't an R1)

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (1,2,1,'V1')

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (1,2,2,'E1')

    --This patient should be counted: (even though there's a R1 there isn't an V1, it's a V2)

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (2,11,1,'E1')

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (2,11,2,'R1')

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (2,11,2,'V2')

    --This patient should not be counted:

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (3,1,1,'E2')

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (3,1,2,'R1')

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (3,1,3,'V1')

    --This patient should not be counted:

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (4,21,1,'E3')

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (4,21,2,'V2')

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (4,21,3,'R2')

    --This patient should not be counted:

    insert #diagnosis (hospital_id,patient_id,diagnosis_seq,diagnosis_code) values (5,31,1,'Q5')

    select distinct d1.hospital_id, d1.patient_id from #diagnosis d1

    left join #diagnosis d2

    ON d2.hospital_id = d1.hospital_id

    and d2.patient_id = d1.patient_id

    and d2.diagnosis_code = 'V1'

    left join #diagnosis d3

    ON d3.hospital_id = d1.hospital_id

    and d3.patient_id = d1.patient_id

    and d3.diagnosis_code = 'R1'

    left join #diagnosis d4

    on d4.hospital_id = d1.hospital_id

    and d4.patient_id = d1.patient_id

    and d4.diagnosis_code = 'V2'

    left join #diagnosis d5

    on d5.hospital_id = d1.hospital_id

    and d5.patient_id = d1.patient_id

    and d5.diagnosis_code = 'R2'

    where

    (

    (d2.hospital_id is null

    AND

    d3.hospital_id is null

    )

    OR

    (d4.hospital_id is null

    AND

    d5.hospital_id is null

    )

    )

    AND

    d1.diagnosis_code like 'E%'

    drop table #diagnosis

  • Not necessarily the most elegant solution, but based on your sample data, it seems to work:

    select distinct

        d1.hospital_id,

        d1.patient_id

    from

        #diagnosis d1

    where

        exists (select 1 from #diagnosis d2 where d1.hospital_id = d2.hospital_id and d1.patient_id = d2.patient_id and d2.diagnosis_code like 'E%')

        and not ((exists (select 1 from #diagnosis d2 where d1.hospital_id = d2.hospital_id and d1.patient_id = d2.patient_id and d2.diagnosis_code = 'R1')

            and exists (select 1 from #diagnosis d2 where d1.hospital_id = d2.hospital_id and d1.patient_id = d2.patient_id and d2.diagnosis_code = 'V1'))

            or (exists (select 1 from #diagnosis d2 where d1.hospital_id = d2.hospital_id and d1.patient_id = d2.patient_id and d2.diagnosis_code = 'R2')

            and exists (select 1 from #diagnosis d2 where d1.hospital_id = d2.hospital_id and d1.patient_id = d2.patient_id and d2.diagnosis_code = 'V2')))

    order by

        d1.hospital_id,

        d1.patient_id

     

  • Here's an alternative. Not saying it's better, it may perform better on large datasets. It uses a derived tables which sets bit flags to indicate if certain conditions are met across the set:

    Select hospital_id, patient_id

    From

    (

      Select hospital_id, patient_id,

        Max(Case When diagnosis_code Like 'E%' Then 1 Else 0 End) As E_bit,

        Max(Case When diagnosis_code =  'V1' Then 1 Else 0 End) As V1_bit,

        Max(Case When diagnosis_code =  'R1' Then 1 Else 0 End) As R1_bit,

        Max(Case When diagnosis_code =  'V2' Then 1 Else 0 End) As V2_bit,

        Max(Case When diagnosis_code =  'R2' Then 1 Else 0 End) As R2_bit

      From #diagnosis

      Group By hospital_id, patient_id

    ) dtBits

    Where E_bit = 1

    And  Not (V1_bit = 1 And R1_bit = 1)

    And  Not (V2_bit = 1 And R2_bit = 1)

  • Thanks both for the help;  After some experiementing, Ten Centuries' method runs a lot slower but it is a LOT easier to read and thus to maintain in the future.

    The final query, if you're curious, turned out to be this beast here (as you can see with so many diagnosis in the clauses it makes it easier to have a neat list of the 1/0 flags.  This also will make it easy to add yet more choices when the inevitable list of 'oh, we forgot to exlude these when this other case happens' comes along):

    select df.facility_name ,Flags.facility_id_key ,dd.Quarter_Number_In_Fiscal_Year

    ,dd.Fiscal_Year_Number,count(distinct Flags.patient_id_key) as Tally

    from

    (select distinct patient_id_key, facility_id_key, patient_control_number

    ,Discharge_Date_Key,Service_Date_Key,Discharge_DRG_Key

    , max(case when ICD9_Diagnosis_Code in

    ('765.01','765.02','765.03','765.04','765.05','765.06','765.07'

    ,'765.08','765.11','765.12','765.13','765.14',

    '765.15','765.16','765.17','765.18','765.21','765.22','765.23'

    ,'765.24','765.25','765.26','765.27') Then 1 Else 0 End) As Part1

    , max(case when ICD9_Diagnosis_Code in ('767.0') Then 1 Else 0 End) As Part2

    , max(case when ICD9_Diagnosis_Code in ('767.3','767.4') Then 1 Else 0 End) As Part3

    , max(case when ICD9_Diagnosis_Code in ('756.51') Then 1 Else 0 End) As Part4

    , max(case when d1.ICD9_Diagnosis_Code in

    ('767.0','767.11','767.3','737.4','767.7','767.8','767.9') Then 1 Else 0 End) As Part5

    From Fact_Diagnosis_ICD9 f2

    inner join dbo.DIM_Diagnosis_ICD9 d1

    on f2.ICD9_Diagnosis_ID_Key = d1.ICD9_Diagnosis_ID

    group by patient_id_key, facility_id_key, patient_control_number

    ,Discharge_Date_Key,Service_Date_Key,Discharge_DRG_Key

    ) Flags

    inner join dbo.DIM_Date dd

    on Flags.Discharge_Date_Key = dd.Date_ID

    inner join fact_patient_visit fpv

    on fpv.facility_id_key = Flags.facility_id_key

    and fpv.patient_control_number = Flags.patient_control_number

    and fpv.Patient_ID_Key = Flags.Patient_ID_Key

    and fpv.Discharge_Date_Key = Flags.Discharge_Date_Key

    and fpv.Service_Date_Key = Flags.Service_Date_Key

    and fpv.Discharge_DRG_Key = Flags.Discharge_DRG_Key

    inner join dim_facility df

    on df.facility_id = fpv.facility_id_key

    where

    Part5 = 1

    and not (part1 =1 and part2 =1)

    and not (part3 =1 and part4 =1)

    and dd.Fiscal_Year_number >= 2005

    and Patient_Type_Key = 1

    and Bill_Status_ID_Key = 1

    and fpv.Current_Record_Indicator = 'Y'

    group by df.mbo,df.facility_name,Flags.facility_id_key,dd.Fiscal_Year_Number

    ,dd.Quarter_Number_In_Fiscal_Year

    order by df.mbo,df.facility_name,dd.Fiscal_Year_Number,dd.Quarter_Number_In_Fiscal_Year

    Thanks again!

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply