Using 'row_number() over (partition by' to get a count

  • I have a CTE successfully using 'row_number() over (partition by' to return a list of patients with more than one diagnosis using 'and CTE.Row_Num > 1'. The challenge is I need to add a couple of lines to eliminate patients that have more than one 'dx_class_id' with the value of 276. I've tried adding a partition to count 'dx_class_id' where the count > 1: 'count(d.dx_class_id) over (partition by d.dx_class_id order by d.dx_class_id) Row_Num2' and 'and CTE.Row_Num2 = 1' like the first block of code but it returns no records, any ideas?:

    with CTE
    as (select distinct
    d.pat_id
    , d.end_dt
    , d.dx_id
    , row_number() over (partition by d.pat_id order by d.pat_id) Row_Num
    , count(d.dx_class_id) over (partition by d.dx_class_id order by d.dx_class_id) Row_Num2
    from dbo.tb_pat_diags d
    where d.dx_class_id = 276
    and d.end_dt is null
    or d.end_dt > dateadd(dd, -180, getdate())
    and d.active = 1)

    select
    CTE.pat_id
    from CTE
    inner join dbo.vw_claims cl
    on cl.pat_id = CTE.pat_id
    where cl.clm_dos > dateadd(dd, -180, getdate())
    and cl.pat_id is not null
    and cl.stat_id = 1
    and cl.curnt_entry = 1
    and CTE.Row_Num > 1
    and CTE.Row_Num2 = 1
    group by CTE.pat_id
    order by CTE.pat_id;
    with CTE
    as (select distinct
    d.pat_id
    , d.end_dt
    , d.dx_id
    , row_number() over (partition by d.pat_id order by d.pat_id) Row_Num
    from dbo.tb_pat_diags d
    where d.dx_class_id = 276
    and d.end_dt is null
    or d.end_dt > dateadd(dd, -180, getdate())
    and d.active = 1)

    select
    CTE.pat_id
    from CTE
    inner join dbo.vw_claims cl
    on cl.pat_id = CTE.pat_id
    where cl.clm_dos > dateadd(dd, -180, getdate()) -- claim in the past 180 days
    and cl.pat_id is not null
    and cl.stat_id = 1
    and cl.curnt_entry = 1
    and CTE.Row_Num > 1
    group by CTE.pat_id
    order by CTE.pat_id;
  • It looks like you would want this count(d.dx_class_id) over (partition by d.dx_class_id order by d.dx_class_id) Row_Num2

     

    Partitioned by pat_id not dx_class_id

  • It's still returning no records. If I eliminate the COUNT I get 3,000 + but about 700 have multiple '276' counts that need to not be returned.

  • Hold up for now, I think my data is flawed.

  • I double checked my data and it looks good but the counter may not be working:

    with CTE
    as (select distinct
    d.pat_id
    , d.end_dt
    , d.dx_id
    , row_number() over (partition by d.pat_id order by d.pat_id) Row_Num -- count of the dxs in tb_pat_diags table
    , count(d.dx_class_id) over (partition by d.pat_id order by d.pat_id) Row_Num2 -- count of the 276s
    from dbo.tb_pat_diags d
    where d.dx_class_id = 276
    and d.end_dt is null
    or d.end_dt > dateadd(dd, -180, getdate())
    and d.active = 1)

    select
    CTE.pat_id
    from CTE
    inner join dbo.vw_claims cl
    on cl.pat_id = CTE.pat_id
    where cl.clm_dos > dateadd(dd, -180, getdate())
    and cl.pat_id is not null
    and cl.stat_id = 1
    and cl.curnt_entry = 1
    and CTE.Row_Num > 1 -- patient has multiple dxs in tb_pat_diags table
    and CTE.Row_Num2 < 2 -- patient has no more than one 276 in the tb_pat_diags table
    group by CTE.pat_id
    order by CTE.pat_id;
  • Well what do you get when you comment out the Row_Num2 check in the where clause and include it in the output?

    Also keep in mind in your CTE the count is happening after the where clause is applied so it won't be the total count of records with 276 in that field for each pat_id.

  • I would not use a row_number. Row_number is useful for returning a specific row. The requirement is to identify pat_ids that have multiple dx ,but do not have multiple excluded dx. I would create two sets of pat_id and compare them, before checking against claims.

    If we continue with the row_number approach, there is a problem with the posted join to claims. Row_num > 1 will return duplicates if the pat_id has > 2 dx_id, but using Row_num = 2 will return a single pat_id with 2 or more dx.

    Row_Num2 isn't in anyway correlated with the excluded dx_class_id, however, we can sum(case) to create a count of excluded dx. This is not how I would do it, but it might work. I have guessed what the end_dt and active columns mean so the params may well be wrong.

    with CTE
    as (select d.pat_id,
    row_number() over (partition by d.pat_id order by d.pat_id) Row_Num,
    sum(case when d.dx_class_id = 276 then 1 else 0 end) over (partition by pat_id) AS Row_Num2
    from dbo.tb_pat_diags as d
    where d.end_dt is null
    or (d.end_dt > dateadd(dd, -180, getdate()) and d.active = 1)
    )


    Unless the claim detail is required there is no need to join fully to claims and a semi join is sufficient. If the claim detail is required then a join would be necessary.

    select  a.pat_id
    from CTE AS a
    where a.Row_Num =2 -- Single pat_id for patients with row_num >= 1
    and a.Row_Num2 <2 -- Count excluded dx class < 2
    and exists ( select 1
    from dbo.vw_claims as cl
    where cl.pat_id = a.pat_id
    and cl.clm_dos > dateadd(dd, -180, getdate())
    and cl.stat_id = 1
    and cl.curnt_entry = 1
    )

     

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply