Counting distinct periods of consecutive days

  • Comments posted to this topic are about the item Counting distinct periods of consecutive days

  • It looks to me like your example counts the last day in a consecutive day grouping as a visit. Here are some wrinkles:

    First, your data are pure dates. I see this sort of problem more often with input data that are date/time. Second, you are counting the visit on the last day. This is all well and good if you're selecting from an unfiltered data set. But what if you're needing to use this technique against a date range, say patient visits this month? How would you set it up so that your results from querying each month in the year will logically match the results for the whole year, etc.

  • report logic might have a slight flaw. I get why you'd do this for ER visits over midnight, but what if the patient comes in on Friday at noon, goes home and gets sick again or hit by a bus, shows up Saturday afternoon? Unless you're trying to treat these as root cause issues, and want to know that the 2nd visit was avoidable because the 1st didn't treat everything, which is very subjective.

    just something to think about, there's probably not a 100% correct answer to this one without clinical review

    ---------------------------------------------------------
    How best to post your question[/url]
    How to post performance problems[/url]
    Tally Table:What it is and how it replaces a loop[/url]

    "stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes."

  • casey-172678 (10/4/2012)


    It looks to me like your example counts the last day in a consecutive day grouping as a visit. Here are some wrinkles:

    First, your data are pure dates. I see this sort of problem more often with input data that are date/time. Second, you are counting the visit on the last day. This is all well and good if you're selecting from an unfiltered data set. But what if you're needing to use this technique against a date range, say patient visits this month? How would you set it up so that your results from querying each month in the year will logically match the results for the whole year, etc.

    I think the script actually counts the number of days that don't have a preceding day, so it should work correctly if you want to select ranges of days for reports.

    So if you want the number of visits that started in Jan 2012 you would add

    where tbl1.er_date between '1/1/2012' and '1/31/2012'

  • jcrawf02 (10/4/2012)


    report logic might have a slight flaw. I get why you'd do this for ER visits over midnight, but what if the patient comes in on Friday at noon, goes home and gets sick again or hit by a bus, shows up Saturday afternoon? Unless you're trying to treat these as root cause issues, and want to know that the 2nd visit was avoidable because the 1st didn't treat everything, which is very subjective.

    just something to think about, there's probably not a 100% correct answer to this one without clinical review

    Actually, we do that quite often when handling claims. We call them interim claims and allow up to three days between when combining. It's far more common to have someone come back because of further complications rather than something unrelated.

    Harsha,

    I've found that when you given one report like that, you'll be asked to do it another way. For a DOJ project, I used this so it could be cut many ways.

    IF OBJECT_ID('tempdb..#Claims') IS NOT NULL

    DROP TABLE #Claims

    IF OBJECT_ID('tempdb..#ClaimSpans') IS NOT NULL

    DROP TABLE #ClaimSpans

    CREATE TABLE #Claims (ClaimID int IDENTITY(1,1) PRIMARY KEY, PCN varchar(10), Admit date, Discharge date)

    INSERT INTO #Claims (PCN, Admit, Discharge)

    VALUES

    ('000000000A', '2012-01-01', '2012-01-03')

    , ('000000000A', '2012-01-05', '2012-01-09')

    , ('000000000A', '2012-01-10', '2012-01-11')

    , ('000000000A', '2012-01-12', '2012-01-12')

    , ('000000000A', '2012-01-14', '2012-01-19')

    , ('000000000A', '2012-01-20', '2012-01-31')

    , ('000000000B', '2012-01-05', '2012-01-09')

    , ('000000000B', '2012-01-10', '2012-01-11')

    , ('000000000B', '2012-02-05', '2012-02-09')

    , ('000000000B', '2012-02-10', '2012-02-11')

    , ('000000000B', '2012-02-12', '2012-02-14')

    , ('000000000B', '2012-02-15', '2012-02-16')

    , ('000000000B', '2012-02-27', '2012-02-28')

    , ('000000000B', '2012-03-9', '2012-03-11')

    , ('000000000C', '2012-03-05', '2012-03-09')

    , ('000000000D', '2012-04-10', '2012-04-11')

    ;WITH Buckets (RowNum, PCN, Yr, Mnth, Admit) as

    (

    SELECT

    ROW_NUMBER() OVER (PARTITION BY PCN ORDER BY PCN, Admit) as RowNum,

    PCN,

    YEAR(Admit) as Yr,

    MONTH(Admit) as Mnth,

    Admit

    FROM #Claims

    GROUP BY PCN, YEAR(Admit), MONTH(Admit), Admit

    )

    SELECT z.PCN, z.Yr, z.Mnth, z.Admit, y.Admit as PrevAdmit, DATEDIFF(d, y.Admit, z.Admit) as Days, z.RowNum, y.RowNum as PrevRowNum

    INTO #ClaimSpans

    FROM Buckets as z

    LEFT JOIN Buckets as y

    ON z.RowNum - 1 = y.RowNum AND z.PCN = y.PCN

    SELECT * FROM #ClaimSpans

    -- find claims for patients that came back between 2-4 days

    SELECT * FROM #ClaimSpans WHERE Days BETWEEN 2 and 4

    -- find the count of patients that started in one group below and either stayed within the same group or moved to another span

    ;WITH Buckets (PCN, Bucket, Days, OldRowNum, RowNum)

    as

    (

    SELECT

    PCN,

    CASE

    WHEN Days BETWEEN 0 AND 1 THEN '0 to 1'

    WHEN Days BETWEEN 2 AND 4 THEN '2 to 4'

    WHEN Days BETWEEN 4 AND 7 THEN '4 to 7'

    WHEN Days BETWEEN 8 AND 10 THEN '8 to 10'

    WHEN Days BETWEEN 11 AND 14 THEN '11 to 14'

    ELSE '15 to 20' END as Bucket,

    Days,

    RowNum as OldRowNum,

    ROW_NUMBER() OVER (PARTITION BY PCN ORDER BY PCN, RowNum) as RowNum -- recalc RowNum

    FROM #ClaimSpans

    WHERE RowNum > 1 -- first claim won't join to a previous claim

    AND Days <= 20 -- claims over 20 days won't count

    )

    SELECT

    r.FromBucket,

    r.ToBucket,

    COUNT(*) as Cnt

    FROM

    (

    SELECT

    a.Bucket as FromBucket,

    b.Bucket as ToBucket

    FROM Buckets as a

    JOIN Buckets as b

    ON a.PCN = b.PCN AND a.RowNum = b.RowNum - 1 AND a.OldRowNum = b.OldRowNum - 1 -- OldRowNum makes sure that there wasn't a break in the sequence, like when the 20+ days were pulled

    ) as r

    GROUP BY FromBucket, ToBucket

    -- clean up

    IF OBJECT_ID('tempdb..#Claims') IS NOT NULL

    DROP TABLE #Claims

    IF OBJECT_ID('tempdb..#ClaimSpans') IS NOT NULL

    DROP TABLE #ClaimSpans

    /* Anything is possible but is it worth it? */

  • Here's a good reference article.

    Group Islands of Contiguous Dates (SQL Spackle)

    http://www.sqlservercentral.com/articles/T-SQL/71550/

  • Thanks for the script.

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply