Counting distinct periods of consecutive days

 Author Message harsha.majety SSC Journeyman Group: General Forum Members Points: 75 Visits: 448 Comments posted to this topic are about the item Counting distinct periods of consecutive days casey-172678 Grasshopper Group: General Forum Members Points: 22 Visits: 17 It looks to me like your example counts the last day in a consecutive day grouping as a visit. Here are some wrinkles:First, your data are pure dates. I see this sort of problem more often with input data that are date/time. Second, you are counting the visit on the last day. This is all well and good if you're selecting from an unfiltered data set. But what if you're needing to use this technique against a date range, say patient visits this month? How would you set it up so that your results from querying each month in the year will logically match the results for the whole year, etc. jcrawf02 SSCrazy Group: General Forum Members Points: 2222 Visits: 19324 report logic might have a slight flaw. I get why you'd do this for ER visits over midnight, but what if the patient comes in on Friday at noon, goes home and gets sick again or hit by a bus, shows up Saturday afternoon? Unless you're trying to treat these as root cause issues, and want to know that the 2nd visit was avoidable because the 1st didn't treat everything, which is very subjective. just something to think about, there's probably not a 100% correct answer to this one without clinical review ---------------------------------------------------------How best to post your questionHow to post performance problemsTally Table:What it is and how it replaces a loop"stewsterl 80804 (10/16/2009)I guess when you stop and try to understand the solution provided you not only learn, but save yourself some headaches when you need to make any slight changes." John Rees Valued Member Group: General Forum Members Points: 65 Visits: 539 casey-172678 (10/4/2012)It looks to me like your example counts the last day in a consecutive day grouping as a visit. Here are some wrinkles:First, your data are pure dates. I see this sort of problem more often with input data that are date/time. Second, you are counting the visit on the last day. This is all well and good if you're selecting from an unfiltered data set. But what if you're needing to use this technique against a date range, say patient visits this month? How would you set it up so that your results from querying each month in the year will logically match the results for the whole year, etc.I think the script actually counts the number of days that don't have a preceding day, so it should work correctly if you want to select ranges of days for reports.So if you want the number of visits that started in Jan 2012 you would add`where tbl1.er_date between '1/1/2012' and '1/31/2012'` Gatekeeper SSC-Addicted Group: General Forum Members Points: 448 Visits: 888 jcrawf02 (10/4/2012)report logic might have a slight flaw. I get why you'd do this for ER visits over midnight, but what if the patient comes in on Friday at noon, goes home and gets sick again or hit by a bus, shows up Saturday afternoon? Unless you're trying to treat these as root cause issues, and want to know that the 2nd visit was avoidable because the 1st didn't treat everything, which is very subjective. just something to think about, there's probably not a 100% correct answer to this one without clinical reviewActually, we do that quite often when handling claims. We call them interim claims and allow up to three days between when combining. It's far more common to have someone come back because of further complications rather than something unrelated.Harsha, I've found that when you given one report like that, you'll be asked to do it another way. For a DOJ project, I used this so it could be cut many ways.`IF OBJECT_ID('tempdb..#Claims') IS NOT NULL DROP TABLE #ClaimsIF OBJECT_ID('tempdb..#ClaimSpans') IS NOT NULL DROP TABLE #ClaimSpansCREATE TABLE #Claims (ClaimID int IDENTITY(1,1) PRIMARY KEY, PCN varchar(10), Admit date, Discharge date)INSERT INTO #Claims (PCN, Admit, Discharge) VALUES ('000000000A', '2012-01-01', '2012-01-03') , ('000000000A', '2012-01-05', '2012-01-09') , ('000000000A', '2012-01-10', '2012-01-11') , ('000000000A', '2012-01-12', '2012-01-12') , ('000000000A', '2012-01-14', '2012-01-19') , ('000000000A', '2012-01-20', '2012-01-31') , ('000000000B', '2012-01-05', '2012-01-09') , ('000000000B', '2012-01-10', '2012-01-11') , ('000000000B', '2012-02-05', '2012-02-09') , ('000000000B', '2012-02-10', '2012-02-11') , ('000000000B', '2012-02-12', '2012-02-14') , ('000000000B', '2012-02-15', '2012-02-16') , ('000000000B', '2012-02-27', '2012-02-28') , ('000000000B', '2012-03-9', '2012-03-11') , ('000000000C', '2012-03-05', '2012-03-09') , ('000000000D', '2012-04-10', '2012-04-11');WITH Buckets (RowNum, PCN, Yr, Mnth, Admit) as( SELECT ROW_NUMBER() OVER (PARTITION BY PCN ORDER BY PCN, Admit) as RowNum, PCN, YEAR(Admit) as Yr, MONTH(Admit) as Mnth, Admit FROM #Claims GROUP BY PCN, YEAR(Admit), MONTH(Admit), Admit)SELECT z.PCN, z.Yr, z.Mnth, z.Admit, y.Admit as PrevAdmit, DATEDIFF(d, y.Admit, z.Admit) as Days, z.RowNum, y.RowNum as PrevRowNumINTO #ClaimSpansFROM Buckets as zLEFT JOIN Buckets as y ON z.RowNum - 1 = y.RowNum AND z.PCN = y.PCNSELECT * FROM #ClaimSpans-- find claims for patients that came back between 2-4 daysSELECT * FROM #ClaimSpans WHERE Days BETWEEN 2 and 4-- find the count of patients that started in one group below and either stayed within the same group or moved to another span;WITH Buckets (PCN, Bucket, Days, OldRowNum, RowNum)as(SELECT PCN, CASE WHEN Days BETWEEN 0 AND 1 THEN '0 to 1' WHEN Days BETWEEN 2 AND 4 THEN '2 to 4' WHEN Days BETWEEN 4 AND 7 THEN '4 to 7' WHEN Days BETWEEN 8 AND 10 THEN '8 to 10' WHEN Days BETWEEN 11 AND 14 THEN '11 to 14' ELSE '15 to 20' END as Bucket, Days, RowNum as OldRowNum, ROW_NUMBER() OVER (PARTITION BY PCN ORDER BY PCN, RowNum) as RowNum -- recalc RowNumFROM #ClaimSpansWHERE RowNum > 1 -- first claim won't join to a previous claim AND Days <= 20 -- claims over 20 days won't count)SELECT r.FromBucket, r.ToBucket, COUNT(*) as CntFROM( SELECT a.Bucket as FromBucket, b.Bucket as ToBucket FROM Buckets as a JOIN Buckets as b ON a.PCN = b.PCN AND a.RowNum = b.RowNum - 1 AND a.OldRowNum = b.OldRowNum - 1 -- OldRowNum makes sure that there wasn't a break in the sequence, like when the 20+ days were pulled) as r GROUP BY FromBucket, ToBucket-- clean upIF OBJECT_ID('tempdb..#Claims') IS NOT NULL DROP TABLE #ClaimsIF OBJECT_ID('tempdb..#ClaimSpans') IS NOT NULL DROP TABLE #ClaimSpans` /* Anything is possible but is it worth it? */ a.myasnikov SSC Rookie Group: General Forum Members Points: 47 Visits: 141 Here's a good reference article. Group Islands of Contiguous Dates (SQL Spackle)http://www.sqlservercentral.com/articles/T-SQL/71550/ Iwas Bornready SSChampion Group: General Forum Members Points: 13704 Visits: 885 Thanks for the script.