﻿<?xml version='1.0' encoding='UTF-8'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>SQLServerCentral / SQL Server 2008 / T-SQL (SS2K8)  / Consolidate Overlapping Date Periods / Latest Posts</title><generator>InstantForum.NET v2.9.0</generator><description>SQLServerCentral</description><link>http://www.sqlservercentral.com/Forums/</link><webMaster>notifications@sqlservercentral.com</webMaster><lastBuildDate>Wed, 19 Jun 2013 16:12:32 GMT</lastBuildDate><ttl>20</ttl><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>Looks darn near a tie to me, but if you're buying the beers on our next fishing trip, I'll concede to your snapper.</description><pubDate>Mon, 01 Oct 2012 09:12:01 GMT</pubDate><dc:creator>dwain.c</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>[quote][b]dwain.c (10/1/2012)[/b][hr]:-D[quote][b]ChrisM@Work (9/28/2012)[/b][hr][quote][b]dwain.c (9/27/2012)[/b][hr]Phil,I think this is a pretty snappy query too.&amp;lt;&amp;lt;snip&amp;gt;&amp;gt;Chris - I like what you did with the CROSS APPLY and I was able to do that in mine (to eliminate the Tally CTE) and it sped up quite dramatically.[/quote]Thanks Dwain. It took a while to figure out.[quote]Why are you using DENSE_RANK instead of ROW_NUMBER?[/quote]Because my first test harness used a PK which excluded the startdate - it was a much more comprehensive test with different types of overlap.[quote][b]dwain.c (9/30/2012)[/b][hr]&amp;lt;&amp;lt;snip&amp;gt;&amp;gt;I was able to adapt Mr. Ben-Gan's approach to this problem:&amp;lt;&amp;lt;snip&amp;gt;&amp;gt;Shows that I must have done something right because this solution seems to win out:[code="plain"]------ bteraberry's query SQL Server Execution Times:   CPU time = 15834 ms,  elapsed time = 10126 ms.------ Mark's query SQL Server Execution Times:   CPU time = 7051 ms,  elapsed time = 4544 ms.------ Dwain's query SQL Server Execution Times:   CPU time = 17207 ms,  elapsed time = 22379 ms.------ IBG's query (built by Dwain.C for this case) SQL Server Execution Times:   CPU time = 718 ms,  elapsed time = 756 ms.------ ChrisM's query SQL Server Execution Times:   CPU time = 11169 ms,  elapsed time = 16211 ms.------ bterraberry's query (2) SQL Server Execution Times:   CPU time = 999 ms,  elapsed time = 1019 ms.[/code][/quote]Not for long [code]------ bteraberry's query SQL Server Execution Times:   CPU time = 14914 ms,  elapsed time = 11965 ms. ------ Mark's query SQL Server Execution Times:   CPU time = 6739 ms,  elapsed time = 5010 ms. ------ Dwain's query SQL Server Execution Times:   CPU time = 17534 ms,  elapsed time = 23217 ms. ------ IBG's query (built by Dwain.C for this case) SQL Server Execution Times:   CPU time = 936 ms,  elapsed time = 1005 ms. ------ IBG's query (built by ChrisM for this case) SQL Server Execution Times:   CPU time = 952 ms,  elapsed time = 949 ms. ------ ChrisM's query SQL Server Execution Times:   CPU time = 11903 ms,  elapsed time = 13277 ms. ------ bterraberry's query (2) SQL Server Execution Times:   CPU time = 1326 ms,  elapsed time = 1333 ms. [/code]It uses a faster fish:[code="sql"];WITH CTE AS (    SELECT HotelID, RoomTypeID, ts=StartDate, Type = 1 		,e = NULL        ,s = (2*ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY StartDate))-1    FROM #RoomDates     UNION ALL    SELECT HotelID, RoomTypeID, ts=EndDate, Type = -1        ,e = (2*ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY EndDate))        ,s = NULL    FROM #RoomDates )SELECT @HotelID=HotelID, @RoomTypeID=RoomTypeID, @SD = MIN(ts), @ED = MAX(ts) FROM (	SELECT HotelID, RoomTypeID, ts, 		Snapper = (ROW_NUMBER() OVER(ORDER BY HotelID, RoomTypeID, ts)+1)/2	FROM (		SELECT HotelID, RoomTypeID, ts, s, e, 			se = ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY ts, Type DESC)		FROM CTE	) c1 	WHERE se IN (e,s)) C3GROUP BY HotelID, RoomTypeID, SnapperORDER BY HotelID, RoomTypeID, Snapper[/code]</description><pubDate>Mon, 01 Oct 2012 09:02:01 GMT</pubDate><dc:creator>ChrisM@Work</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>[quote][b]CELKO (9/30/2012)[/b][hr][quote] That's certainly one way to do it but that typically results in an OR in the JOIN or WHERE clause which can cause performance problems.  With that thought in mind, I'll close the interval with a very high date such as 9999-12-30.  Notice that I don't use the last date available (9999-12-31) in SQL Server so that I can use the classic and easily SARGable .. [/quote]I don't think the performance hit is that bad today; I find I usually write "x_ date BETWEEN start_date AND COALESCE (end_date, CURRENT_TIMESTAMP)" since it preserves the truth. I cannot see the future, but I am still alive today :-)  But more than that, we have DATE data types today, so the old Sybase dialect half-open trick is not needed any more. The sensible, easily read BETWEEN is good now, after all the decades of dialect.[/quote]Between won't work properly if times are involved.  And, although I agree that the relatively new DATE datatype is a Godsend for some, I always enjoy it when someone decides they really do want "time" to be a component of the column and their code starts returning whacky numbers after they change the DATE columns to DATETIME columns.  Using the method I suggested, that whacky numbers won't happen even for such a change.I agree that "x_ date BETWEEN start_date AND COALESCE (end_date, CURRENT_TIMESTAMP)"  does a reasonable job (including an index seek for the start_date) for problems up until "now" but it doesn't handle outlying date problems for future reservations and the like. </description><pubDate>Mon, 01 Oct 2012 06:48:11 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>:-D[quote][b]ChrisM@Work (9/28/2012)[/b][hr][quote][b]dwain.c (9/27/2012)[/b][hr]Phil,I think this is a pretty snappy query too.[code="sql"];WITH Tally (n) AS (        SELECT TOP (SELECT 1+MAX(DATEDIFF(day, StartDate, EndDate)) FROM #RoomDates)            ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1            FROM sys.all_columns),    GroupDates AS (        SELECT HotelID, RoomTypeID            ,[Date]            ,DateGroup = DATEADD(day                    , -ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY [Date]), [Date])        FROM #RoomDates        CROSS APPLY (            SELECT n, [Date]=DATEADD(day, n, StartDate)             FROM Tally             WHERE DATEADD(day, n, StartDate) BETWEEN StartDate AND EndDate) a        GROUP BY HotelID, RoomTypeID, [Date])SELECT HotelID, RoomTypeID    ,StartDate=MIN([Date])    ,EndDate=MAX([Date])FROM GroupDates[/code]This is the first time I've been able to successfully apply Jeff Moden's method for "Grouping Islands of Contiguous Dates" [url]http://www.sqlservercentral.com/articles/T-SQL/71550/[/url]!Not saying I fully understand it but at least I can go through the motions now. :-D[b]Edit:[/b] Eliminated an unnecessary CROSS APPLY.[/quote]Jeff's article was the inspiration for this effort too;[code="sql"]SELECT HotelId, RoomTypeId, startdate = MIN(startdate), enddate = MAX(enddate)FROM (	SELECT 		HotelId, RoomTypeId, startdate, enddate,		Grouper = DATEADD(day, 			0-DENSE_RANK() OVER (PARTITION BY HotelId, RoomTypeId ORDER BY InDate), 			InDate) 	FROM #RoomDates	CROSS APPLY (		SELECT TOP(1+DATEDIFF(DAY,startdate,enddate)) 			InDate = DATEADD(day,				(ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1),				startdate) 		from sys.columns	) x) cGROUP BY HotelId, RoomTypeId, Grouper[/code][/quote]Chris - I like what you did with the CROSS APPLY and I was able to do that in mine (to eliminate the Tally CTE) and it sped up quite dramatically.Why are you using DENSE_RANK instead of ROW_NUMBER?</description><pubDate>Mon, 01 Oct 2012 00:00:18 GMT</pubDate><dc:creator>dwain.c</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>[quote][b]Jeff Moden (9/28/2012)[/b][hr]Most of you know me.  If I thought a Tally Table would be the best thing for this problem, I'd be on it like white on rice.  The Tally Table and all of it's wonderful different renditions are not the way to go on something like this, IMHO.I'll also state that I've not done a performance comparision with any of the code offered, so far.  I can only state that I've done previous comparisons and I've found one bit of code that, at the time, blew away the competition.  That code comes from Itzik Ben-Gan and may be found at the following URL.  Like SSC, it requires simple spam-free membership but it's absolute worth the time to sign up.  Here's the URL for "Packing Date Intervals" such as what is being done on this thread.[url]http://www.solidq.com/sqj/Pages/2011-March-Issue/Packing-Intervals.aspx[/url][/quote]I was able to adapt Mr. Ben-Gan's approach to this problem:[code="sql"]-- Method by Itzik Ben-Gan-- http://www.solidq.com/sqj/Pages/2011-March-Issue/Packing-Intervals.aspx ;WITH C1 AS (        SELECT HotelID, RoomTypeID, ts=StartDate, Type=1, e=NULL            ,s=ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY StartDate)        FROM #RoomDates        UNION ALL        SELECT HotelID, RoomTypeID, ts=EndDate, Type=-1            ,e=ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY EndDate)            ,s=NULL        FROM #RoomDates),    C2 AS (        SELECT C1.*            ,se=ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY ts, Type DESC)        FROM C1),    C3 AS (        SELECT HotelID, RoomTypeID, ts            ,grpnm=FLOOR((ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY ts)-1) / 2 + 1)        FROM C2        WHERE COALESCE(s-(se-s)-1, (se-e)-e) = 0)SELECT HotelID, RoomTypeID, StartDate=MIN(ts), EndDate=MAX(ts)FROM C3GROUP BY HotelID, RoomTypeID, grpnm[/code]It is very clever for sure.Testing of the proposed solutions using this test harness:[code="sql"]-- Create a tally tableif object_id('tempdb.dbo.#Nums', 'U') is nullBEGINSELECT TOP(1000000) n = ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1INTO #NumsFROM sys.columns a, sys.columns b, sys.columns cCREATE UNIQUE CLUSTERED INDEX UCX_n ON #Nums (n)ENDif object_id('tempdb.dbo.#RoomDates', 'U') is not null    drop table #RoomDates  create table #RoomDates (    HotelId int not null    ,RoomTypeId int not null    ,StartDate date not null    ,EndDate date not null    ,constraint [PK_RD1] primary key clustered (      HotelId      ,RoomTypeId      ,StartDate      ) with (      pad_index = off      ,statistics_norecompute = off      ,ignore_dup_key = off      ,allow_row_locks = on      ,allow_page_locks = on      ) on [PRIMARY]    ) on [PRIMARY]            insert #RoomDates ( HotelId      ,RoomTypeId      ,StartDate      ,EndDate)    values (1,1,'2012-01-01', '2012-02-01'), (1,1,'2012-01-10', '2012-02-05'), (1,1,'2012-02-4', '2012-03-01'),           (2,1,'2012-01-01', '2012-02-01'), (2,1,'2012-01-10', '2012-01-20'),           (3,1,'2012-01-01', '2012-02-01'), (3,1,'2012-03-10', '2012-03-20')        INSERT #RoomDates (HotelID, RoomTypeID, StartDate, EndDate)    SELECT 10+n1, n2        ,StartDate, EndDate=DATEADD(day, ABS(CHECKSUM(NEWID())) % 20, StartDate)    FROM (SELECT n1=n FROM #Nums WHERE n BETWEEN 1 AND 50) a    CROSS APPLY (SELECT n2=n FROM #Nums WHERE n BETWEEN 1 AND 60) b    CROSS APPLY (SELECT n3=n FROM #Nums WHERE n BETWEEN 1 AND 80) c    CROSS APPLY (SELECT StartDate=DATEADD(day, n3*6, '2009-01-01')) d--SELECT COUNT(*) FROM #RoomDates        --select * from #RoomDatesDECLARE @HotelID INT, @RoomTypeID INT, @SD DATE, @ED DATE, @StartDT DATETIME, @EndDT DATETIME    --select HotelId    --  ,RoomTypeId    --  ,StartDate    --  ,EndDate    --  from (values (1,1,'2012-01-01', '2012-03-01'),    --               (2,1,'2012-01-01', '2012-02-01'),     --               (3,1,'2012-01-01', '2012-02-01'),     --               (3,1,'2012-03-10', '2012-03-20')) DesiredResults (HotelId, RoomTypeId, StartDate, EndDate)SET NOCOUNT ONPRINT '------ bteraberry''s query'SET STATISTICS TIME ONselect	@HotelID=dt.HotelId	,	@RoomTypeID=dt.RoomTypeId	,	@StartDT = dateadd(dd, min(dt.N), '20000101')	,	@EndDT = dateadd(dd, max(dt.N), '20000101')from	(	select	distinct			rd.HotelId		,	rd.RoomTypeId		,	n.N		,	grp = n.N - dense_rank() over(order by rd.HotelId, rd.RoomTypeId, n.N)	from #RoomDates as rd	join #Nums as n		on n.N between datediff(dd, '20000101', rd.StartDate) and datediff(dd, '20000101', rd.EndDate)	) as dtgroup by dt.HotelId, dt.RoomTypeId, dt.grp--order by dt.HotelId, dt.RoomTypeId, min(dt.N);SET STATISTICS TIME OFFPRINT '------ Mark''s query'SET STATISTICS TIME ON;WITH StartsAndEnds(StartEnd,StartDate,EndDate,HotelId,RoomTypeId) AS (SELECT 'S' AS StartEnd,       StartDate,       DATEADD(day,-1,StartDate),       HotelId,       RoomTypeIdFROM #RoomDatesUNION ALLSELECT 'E' AS StartEnd,       DATEADD(day,1,EndDate),       EndDate,       HotelId,       RoomTypeIdFROM #RoomDates),OrderedStarts AS (SELECT StartDate,       HotelId,       RoomTypeId,       ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId ORDER BY StartDate,StartEnd DESC) AS rnBoth,       2*(ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId,StartEnd ORDER BY StartDate))-1 AS rnStartEndFROM StartsAndEnds),OrderedEnds AS (SELECT EndDate,       HotelId,       RoomTypeId,       ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId ORDER BY EndDate DESC,StartEnd) AS rnBothRev,       2*(ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId,StartEnd ORDER BY EndDate DESC))-1 AS rnStartEndRevFROM StartsAndEnds),Starts AS (SELECT StartDate,       HotelId,       RoomTypeId,       ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId ORDER BY StartDate) AS rnFROM OrderedStartsWHERE rnBoth=rnStartEnd),Ends AS (SELECT EndDate,       HotelId,       RoomTypeId,       ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId ORDER BY EndDate) AS rnFROM OrderedEndsWHERE rnBothRev=rnStartEndRev)SELECT @HotelID=s.HotelId,@RoomTypeID=s.RoomTypeId,@SD=s.StartDate,@ED=e.EndDateFROM Starts sINNER JOIN Ends e ON e.HotelId=s.HotelId AND e.RoomTypeId=s.RoomTypeId AND e.rn=s.rn AND s.StartDate&amp;lt;=e.EndDate--ORDER BY s.HotelId,s.RoomTypeId,s.StartDate,e.EndDate;SET STATISTICS TIME OFFPRINT '------ Dwain''s query'SET STATISTICS TIME ON;WITH Tally (n) AS (        SELECT TOP (SELECT 1+MAX(DATEDIFF(day, StartDate, EndDate)) FROM #RoomDates)            ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1            FROM sys.all_columns),    GroupDates AS (        SELECT HotelID, RoomTypeID            ,[Date]            ,DateGroup = DATEADD(day                    , -ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY [Date]), [Date])        FROM #RoomDates        CROSS APPLY (            SELECT [Date]=DATEADD(day, n, StartDate)             FROM Tally             WHERE DATEADD(day, n, StartDate) BETWEEN StartDate AND EndDate            ) a        GROUP BY HotelID, RoomTypeID, [Date])SELECT @HotelID=HotelID, @RoomTypeID=RoomTypeID    ,@SD=MIN([Date])    ,@ED=MAX([Date])FROM GroupDatesGROUP BY HotelID, RoomTypeID, DateGroupSET STATISTICS TIME OFFPRINT '------ IBG''s query (built by Dwain.C for this case)'SET STATISTICS TIME ON;WITH C1 AS (        SELECT HotelID, RoomTypeID, ts=StartDate, Type=1, e=NULL            ,s=ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY StartDate)        FROM #RoomDates        UNION ALL        SELECT HotelID, RoomTypeID, ts=EndDate, Type=-1            ,e=ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY EndDate)            ,s=NULL        FROM #RoomDates),    C2 AS (        SELECT C1.*            ,se=ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY ts, Type DESC)        FROM C1),    C3 AS (        SELECT HotelID, RoomTypeID, ts            ,grpnm=FLOOR((ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY ts)-1) / 2 + 1)        FROM C2        WHERE COALESCE(s-(se-s)-1, (se-e)-e) = 0)SELECT @HotelID=HotelID, @RoomTypeID=RoomTypeID, @SD=MIN(ts), @ED=MAX(ts)FROM C3GROUP BY HotelID, RoomTypeID, grpnmSET STATISTICS TIME OFFPRINT '------ ChrisM''s query'SET STATISTICS TIME ONSELECT @HotelID=HotelId, @RoomTypeID=RoomTypeId, @SD = MIN(startdate), @ED = MAX(enddate)FROM (	SELECT 		HotelId, RoomTypeId, startdate, enddate,		Grouper = DATEADD(day, 			0-DENSE_RANK() OVER (PARTITION BY HotelId, RoomTypeId ORDER BY InDate), 			InDate) 	FROM #RoomDates	CROSS APPLY (		SELECT TOP(1+DATEDIFF(DAY,startdate,enddate)) 			InDate = DATEADD(day,				(ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1),				startdate) 		from sys.columns	) x) cGROUP BY HotelId, RoomTypeId, GrouperSET STATISTICS TIME OFFPRINT '------ bterraberry''s query (2)'SET STATISTICS TIME ON;with cteTemp as	(	select	HotelId		,	RoomTypeId		,	BeginDate = case when row_number() over(order by HotelId, RoomTypeId, theDate) - openCnt = 0 then theDate end		,	EndDate = case when row_number() over(order by HotelId, RoomTypeId, theDate) - closeCnt = 0 then theDate end	from		(		select	HotelId			,	RoomTypeId			,	theDate = StartDate			,	closeCnt = null			,	openCnt = (row_number() over(order by HotelId, RoomTypeId, StartDate) *2) - 1		from #RoomDates		union all		select	HotelId			,	RoomTypeId			,	theDate = EndDate			,	closeCnt = row_number() over(order by HotelId, RoomTypeId, EndDate) * 2			,	openCnt = null		from #RoomDates		) as dt	)select	@HotelID=dt.HotelId	,	@RoomTypeID=dt.RoomTypeId	,	@SD = min(dt.BeginDate)	,	@ED = min(dt.EndDate)from 	(	select	HotelId		,	RoomTypeId		,	BeginDate		,	EndDate		,	grpID = case when BeginDate is not null then row_number() over(order by HotelId, RoomTypeId, BeginDate)				else row_number() over(order by HotelId, RoomTypeId, EndDate) end	from cteTemp	where BeginDate is not null or EndDate is not null	) as dtgroup by dt.HotelId, dt.RoomTypeId, dt.grpIDorder by dt.HotelId, dt.RoomTypeId, dt.grpID;SET STATISTICS TIME OFF if object_id('tempdb.dbo.#RoomDates', 'U') is not null    drop table #RoomDates if object_id('tempdb.dbo.#Nums', 'U') is not null    drop table #Nums[/code]Shows that I must have done something right because this solution seems to win out:[code="plain"]------ bteraberry's query SQL Server Execution Times:   CPU time = 15834 ms,  elapsed time = 10126 ms.------ Mark's query SQL Server Execution Times:   CPU time = 7051 ms,  elapsed time = 4544 ms.------ Dwain's query SQL Server Execution Times:   CPU time = 17207 ms,  elapsed time = 22379 ms.------ IBG's query (built by Dwain.C for this case) SQL Server Execution Times:   CPU time = 718 ms,  elapsed time = 756 ms.------ ChrisM's query SQL Server Execution Times:   CPU time = 11169 ms,  elapsed time = 16211 ms.------ bterraberry's query (2) SQL Server Execution Times:   CPU time = 999 ms,  elapsed time = 1019 ms.[/code]Bteraberry's second solution comes a close second and it appears he's using elements of the IBG solution.[b]Edit:[/b] Note that IBG also suggested some indexing to improve his solution's speed, which I did not employ here.</description><pubDate>Sun, 30 Sep 2012 23:39:45 GMT</pubDate><dc:creator>dwain.c</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>[quote] That's certainly one way to do it but that typically results in an OR in the JOIN or WHERE clause which can cause performance problems.  With that thought in mind, I'll close the interval with a very high date such as 9999-12-30.  Notice that I don't use the last date available (9999-12-31) in SQL Server so that I can use the classic and easily SARGable .. [/quote]I don't think the performance hit is that bad today; I find I usually write "x_ date BETWEEN start_date AND COALESCE (end_date, CURRENT_TIMESTAMP)" since it preserves the truth. I cannot see the future, but I am still alive today :-)  But more than that, we have DATE data types today, so the old Sybase dialect half-open trick is not needed any more. The sensible, easily read BETWEEN is good now, after all the decades of dialect.</description><pubDate>Sun, 30 Sep 2012 11:36:23 GMT</pubDate><dc:creator>CELKO</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>[quote][b]CELKO (9/27/2012)[/b][hr]The occupancy_end_date has to be NULL-able to show the guests who are still here (basic temporal idiom in SQL)[/quote]That's certainly one way to do it but that typically results in an OR in the JOIN or WHERE clause which can cause performance problems.  With that thought in mind, I'll close the interval with a very high date such as 9999-12-30.  Notice that I don't use the last date available (9999-12-31) in SQL Server so that I can use the classic and easily SARGable [font="Arial Black"]AND EndDate &amp;gt;= CutoffDate AND EndDate &amp;lt; CutoffDate[/font] where "CutoffDate" is an "open" end of the interval to accomodate when EndDate has a non-midnight time involved.  It also allows for standard handling whether a time is involved or not.</description><pubDate>Sun, 30 Sep 2012 09:14:26 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>Thanks to all who have responded with their ideas. I've been away for a couple of days &amp; will be coding my solution this coming week.Jeff, I have worked through the examples in the link you provided - excellent stuff, thank you.I have a 'RoomType' table which I can use as my base table - just need to create a RoomTypeIntervals function to facilitate the CROSS APPLY &amp; see how it performs when compared with the other suggestions.</description><pubDate>Sun, 30 Sep 2012 01:30:09 GMT</pubDate><dc:creator>Phil Parkin</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>[quote][b]CELKO (9/27/2012)[/b][hr]First, let's fix your DDL. There is no such thing a “type_id” in RDBMS; read ISO-11170 or any book on data modeling. Would you say “blood_type_id”? Of course not! A type and an id are totally different kinds of attribute properties. But why put it in the skeleton since we do not need it for the problem? The occupancy_end_date has to be NULL-able to show the guests who are still here (basic temporal idiom in SQL).--snip[/quote]The system from which the data is taken deals with room types (single, double, family with view of wildebeest etc), not individual rooms. These are held in a master table which has an Id column.The table does not represent room occupancy: it represents room types at hotels which are on, or will be on, stop-sale. This is never an open interval.</description><pubDate>Sun, 30 Sep 2012 01:24:16 GMT</pubDate><dc:creator>Phil Parkin</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>Most of you know me.  If I thought a Tally Table would be the best thing for this problem, I'd be on it like white on rice.  The Tally Table and all of it's wonderful different renditions are not the way to go on something like this, IMHO.I'll also state that I've not done a performance comparision with any of the code offered, so far.  I can only state that I've done previous comparisons and I've found one bit of code that, at the time, blew away the competition.  That code comes from Itzik Ben-Gan and may be found at the following URL.  Like SSC, it requires simple spam-free membership but it's absolute worth the time to sign up.  Here's the URL for "Packing Date Intervals" such as what is being done on this thread.[url]http://www.solidq.com/sqj/Pages/2011-March-Issue/Packing-Intervals.aspx[/url]</description><pubDate>Fri, 28 Sep 2012 19:36:08 GMT</pubDate><dc:creator>Jeff Moden</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>The first solution I posted above (the first reply to Phil) is a clean "islands" style approach that I don't believe is improved upon by the other similar approaches.  However, after posting, I realized that the traditional islands strategy is not desirable in this scenario because of performance considerations.  When I got back from lunch I went to correct my solution and I saw that Mark-101232 had already hit on my line of thinking.  I believe his offering is clearly the best so far in this thread.The problem with any islands approach is that the requirement exists to join each record to N number of records in another table (whether virtual or actual), thus expanding the result set and then performing additional work to aggregate and reduce it back down.  Now in this case, it may seem irrelevant because:a) there is very little data we're dealing with; b) we are using the date data type, so each N represents a full day; andc) the date ranges are small.Suppose these three facts were different.  If we were dealing with a million records ... and a precision of a minute rather than a full day ... and if we had larger date ranges, the islands strategy would suffer greatly.  And the thing is, any join/apply to any other table is completely unnecessary to solve this problem.  A non-join approach will perform better both in small and large scales.Mark's solution may look confusing, but it's basically this:  if you take all the dates (both start and end) and put them together in order of date you have a sequence.  Think of a begin as being +1 and an end as a -1.  Thus, when you get back to 0 (as many ends as beginnings), there is the end of your island.  Now, we're not actually solving it that way because we don't need the mess of a running total, but that same logical approach can be replicated with window functions.The big advantage is that we don't have to multiple the number of records out to potentially huge counts and then do the sorts (which don't scale well) to aggregate back down to our result.  The precision of the date is entirely irrelevant so it will work equally as well with a datetime2 as a date.  Below I have included how I would write the query, although it is essentially the same logic employed in Mark's solution.[code="sql"]with cteTemp as	(	select	HotelId		,	RoomTypeId		,	BeginDate = case when row_number() over(order by HotelId, RoomTypeId, theDate) - openCnt = 0 then theDate end		,	EndDate = case when row_number() over(order by HotelId, RoomTypeId, theDate) - closeCnt = 0 then theDate end	from		(		select	HotelId			,	RoomTypeId			,	theDate = StartDate			,	closeCnt = null			,	openCnt = (row_number() over(order by HotelId, RoomTypeId, StartDate) *2) - 1		from #RoomDates		union all		select	HotelId			,	RoomTypeId			,	theDate = EndDate			,	closeCnt = row_number() over(order by HotelId, RoomTypeId, EndDate) * 2			,	openCnt = null		from #RoomDates		) as dt	)select	dt.HotelId	,	dt.RoomTypeId	,	BeginDate = min(dt.BeginDate)	,	EndDate = min(dt.EndDate)from 	(	select	HotelId		,	RoomTypeId		,	BeginDate		,	EndDate		,	grpID = case when BeginDate is not null then row_number() over(order by HotelId, RoomTypeId, BeginDate)				else row_number() over(order by HotelId, RoomTypeId, EndDate) end	from cteTemp	where BeginDate is not null or EndDate is not null	) as dtgroup by dt.HotelId, dt.RoomTypeId, dt.grpIDorder by dt.HotelId, dt.RoomTypeId, dt.grpID;[/code]</description><pubDate>Fri, 28 Sep 2012 13:41:44 GMT</pubDate><dc:creator>bteraberry</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>[quote][b]dwain.c (9/27/2012)[/b][hr]Phil,I think this is a pretty snappy query too.[code="sql"];WITH Tally (n) AS (        SELECT TOP (SELECT 1+MAX(DATEDIFF(day, StartDate, EndDate)) FROM #RoomDates)            ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1            FROM sys.all_columns),    GroupDates AS (        SELECT HotelID, RoomTypeID            ,[Date]            ,DateGroup = DATEADD(day                    , -ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY [Date]), [Date])        FROM #RoomDates        CROSS APPLY (            SELECT n, [Date]=DATEADD(day, n, StartDate)             FROM Tally             WHERE DATEADD(day, n, StartDate) BETWEEN StartDate AND EndDate) a        GROUP BY HotelID, RoomTypeID, [Date])SELECT HotelID, RoomTypeID    ,StartDate=MIN([Date])    ,EndDate=MAX([Date])FROM GroupDates[/code]This is the first time I've been able to successfully apply Jeff Moden's method for "Grouping Islands of Contiguous Dates" [url]http://www.sqlservercentral.com/articles/T-SQL/71550/[/url]!Not saying I fully understand it but at least I can go through the motions now. :-D[b]Edit:[/b] Eliminated an unnecessary CROSS APPLY.[/quote]Jeff's article was the inspiration for this effort too;[code="sql"]SELECT HotelId, RoomTypeId, startdate = MIN(startdate), enddate = MAX(enddate)FROM (	SELECT 		HotelId, RoomTypeId, startdate, enddate,		Grouper = DATEADD(day, 			0-DENSE_RANK() OVER (PARTITION BY HotelId, RoomTypeId ORDER BY InDate), 			InDate) 	FROM #RoomDates	CROSS APPLY (		SELECT TOP(1+DATEDIFF(DAY,startdate,enddate)) 			InDate = DATEADD(day,				(ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1),				startdate) 		from sys.columns	) x) cGROUP BY HotelId, RoomTypeId, Grouper[/code]</description><pubDate>Fri, 28 Sep 2012 07:21:31 GMT</pubDate><dc:creator>ChrisM@Work</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>First, let's fix your DDL. There is no such thing a “type_id” in RDBMS; read ISO-11170 or any book on data modeling. Would you say “blood_type_id”? Of course not! A type and an id are totally different kinds of attribute properties. But why put it in the skeleton since we do not need it for the problem? The occupancy_end_date has to be NULL-able to show the guests who are still here (basic temporal idiom in SQL) CREATE TABLE Occupancy  (hotel_id INTEGER NOT NULL,  occupancy_start_date DATE NOT NULL,  occupancy_end_date DATE,  CHECK (occupancy_start_date &amp;lt;= occupancy_end_date),  PRIMARY KEY (hotel_id, occupancy_start_date));INSERT INTO Occupancy  VALUES  (1, '2012-01-01', '2012-02-01'), (1, '2012-01-10', '2012-02-05'), (1, '2012-02-04', '2012-03-01'),  (2, '2012-01-01', '2012-02-01'), (2, '2012-01-10', '2012-01-20'),  (3, '2012-01-01', '2012-02-01'),  (3, '2012-03-10', '2012-03-20'); Can I assume you have a calendar table? There should be a julianized day count starting for the earliest date in the Calender ;WITH  X1 (hotel_id, cal_date)AS(SELECT DISTINCT O.hotel_id, C.cal_date   FROM Occupancy AS O, Calendar AS C  WHERE C.cal_date         BETWEEN O.occupancy_start_date           AND COALESCE (O.occupancy_end_date, CURRENT_TIMESTAMP)),C1 (cal_date, julian_date_nbr)AS(SELECT cal_date,    ROW_NUMBER() OVER (ORDER BY cal_date)   FROM Calendar),X2 (hotel_id, cal_date, stay_grp)AS(SELECT X1.hotel_id, X1.cal_date,       (C1.julian_date_nbr -        ROW_NUMBER()        OVER (PARTITION BY X1.hotel_id ORDER BY X1.cal_date))  FROM X1, C1WHERE X1.cal_date = C1.cal_date)SELECT X2.hotel_id, MIN(cal_date) AS arrival_date, MAX(cal_date) AS departure_date  FROM X2 GROUP BY X2.hotel_id, X2.stay_grp;</description><pubDate>Thu, 27 Sep 2012 19:36:13 GMT</pubDate><dc:creator>CELKO</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>Phil,I think this is a pretty snappy query too.[code="sql"];WITH Tally (n) AS (        SELECT TOP (SELECT 1+MAX(DATEDIFF(day, StartDate, EndDate)) FROM #RoomDates)            ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1            FROM sys.all_columns),    GroupDates AS (        SELECT HotelID, RoomTypeID            ,[Date]            ,DateGroup = DATEADD(day                    , -ROW_NUMBER() OVER (PARTITION BY HotelID, RoomTypeID ORDER BY [Date]), [Date])        FROM #RoomDates        CROSS APPLY (            SELECT n, [Date]=DATEADD(day, n, StartDate)             FROM Tally             WHERE DATEADD(day, n, StartDate) BETWEEN StartDate AND EndDate) a        GROUP BY HotelID, RoomTypeID, [Date])SELECT HotelID, RoomTypeID    ,StartDate=MIN([Date])    ,EndDate=MAX([Date])FROM GroupDates[/code]This is the first time I've been able to successfully apply Jeff Moden's method for "Grouping Islands of Contiguous Dates" [url]http://www.sqlservercentral.com/articles/T-SQL/71550/[/url]!Not saying I fully understand it but at least I can go through the motions now. :-D[b]Edit:[/b] Eliminated an unnecessary CROSS APPLY.</description><pubDate>Thu, 27 Sep 2012 18:55:29 GMT</pubDate><dc:creator>dwain.c</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>HiI thought that I would try something out of left field with this as a matter of interest.  I use Geometry a lot and thought it would be interesting to use it for something different.Anyway here's what I came up with ... it will not be as quick as others though[code="sql"]  if object_id('tempdb.dbo.#RoomDates', 'U') is not null    drop table #RoomDates  create table #RoomDates (    HotelId int not null    ,RoomTypeId int not null    ,StartDate date not null    ,EndDate date not null    ,dateGraph Geometry    ,constraint [PK_tempRoom] primary key clustered (      HotelId      ,RoomTypeId      ,StartDate      ) with (      pad_index = off      ,statistics_norecompute = off      ,ignore_dup_key = off      ,allow_row_locks = on      ,allow_page_locks = on      ) on [PRIMARY]    ) on [PRIMARY]        insert #RoomDates ( HotelId      ,RoomTypeId      ,StartDate      ,EndDate)    values (1,1,'2012-01-01', '2012-02-01'), 		   (1,1,'2012-01-10', '2012-02-05'), 		   (1,1,'2012-02-4', '2012-03-01'),   		   (1,1,'2012-03-10', '2012-03-13'),		   (1,1,'2012-03-11', '2012-03-20'),           (2,1,'2012-01-01', '2012-02-01'),            (2,1,'2012-01-10', '2012-01-20'),           (3,1,'2012-01-01', '2012-02-01'),            (1,1,'2012-02-27', '2012-03-05'),            (3,1,'2012-03-10', '2012-03-20')        update #RoomDates    set dateGraph = Geometry::STGeomFromText('LINESTRING (' + 		cast(cast(cast(startdate as datetime) as int) as varchar) + ' ' + cast(roomtypeid + hotelid as varchar) + ', ' +		cast(cast(cast(enddate as datetime) as int) as varchar) + ' ' + cast(roomtypeid + hotelid as varchar) + ')'		,0)		 ;with 	overlap as (	select a.hotelid, a.roomtypeid, a.startdate, a.enddate, b.enddate maxenddate	from #RoomDates a		left join #RoomDates b on			a.hotelid = b.hotelid and 			a.roomtypeid = b.roomtypeid and			a.startdate &amp;lt; b.startdate and			a.dategraph.STIntersects(b.dategraph) = 1	),	grouper as (		select row_number() over (order by hotelid, roomtypeid, startdate) groupid,			hotelid, roomtypeid, startdate, enddate		from overlap a		where maxenddate is null		union all		select g.groupid, g.hotelid, g.roomtypeid, o.startdate, o.enddate		from grouper g			inner join overlap o on g.enddate = o.maxenddate and g.hotelid = o.hotelid and g.roomtypeid = o.hotelid	),	merger as (		select hotelid, roomtypeid, min(startdate) startdate, max(enddate) enddate		from grouper		group by groupid, hotelid, roomtypeid		)select * from merger[/code]</description><pubDate>Thu, 27 Sep 2012 13:55:16 GMT</pubDate><dc:creator>mickyT</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>Yeah, Mark's is much better.  I went to lunch and realized that the much cleaner query had eluded me.</description><pubDate>Wed, 26 Sep 2012 17:07:13 GMT</pubDate><dc:creator>bteraberry</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>Here's a snappy query for you[code="sql"]WITH StartsAndEnds(StartEnd,StartDate,EndDate,HotelId,RoomTypeId) AS (SELECT 'S' AS StartEnd,       StartDate,       DATEADD(day,-1,StartDate),       HotelId,       RoomTypeIdFROM #RoomDatesUNION ALLSELECT 'E' AS StartEnd,       DATEADD(day,1,EndDate),       EndDate,       HotelId,       RoomTypeIdFROM #RoomDates),OrderedStarts AS (SELECT StartDate,       HotelId,       RoomTypeId,       ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId ORDER BY StartDate,StartEnd DESC) AS rnBoth,       2*(ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId,StartEnd ORDER BY StartDate))-1 AS rnStartEndFROM StartsAndEnds),OrderedEnds AS (SELECT EndDate,       HotelId,       RoomTypeId,       ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId ORDER BY EndDate DESC,StartEnd) AS rnBothRev,       2*(ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId,StartEnd ORDER BY EndDate DESC))-1 AS rnStartEndRevFROM StartsAndEnds),Starts AS (SELECT StartDate,       HotelId,       RoomTypeId,       ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId ORDER BY StartDate) AS rnFROM OrderedStartsWHERE rnBoth=rnStartEnd),Ends AS (SELECT EndDate,       HotelId,       RoomTypeId,       ROW_NUMBER() OVER(PARTITION BY HotelId,RoomTypeId ORDER BY EndDate) AS rnFROM OrderedEndsWHERE rnBothRev=rnStartEndRev)SELECT s.HotelId,s.RoomTypeId,s.StartDate,e.EndDateFROM Starts sINNER JOIN Ends e ON e.HotelId=s.HotelId AND e.RoomTypeId=s.RoomTypeId AND e.rn=s.rn AND s.StartDate&amp;lt;=e.EndDateORDER BY s.HotelId,s.RoomTypeId,s.StartDate,e.EndDate;[/code]</description><pubDate>Wed, 26 Sep 2012 15:24:11 GMT</pubDate><dc:creator>Mark-101232</dc:creator></item><item><title>RE: Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>This solution uses a numbers table in [master].[code="sql"]select	dt.HotelId	,	dt.RoomTypeId	,	BeginRange = dateadd(dd, min(dt.N), '20000101')	,	EndRange = dateadd(dd, max(dt.N), '20000101')from	(	select	distinct			rd.HotelId		,	rd.RoomTypeId		,	n.N		,	grp = n.N - dense_rank() over(order by rd.HotelId, rd.RoomTypeId, n.N)	from #RoomDates as rd	join [master].dbo.Nums as n		on n.N between datediff(dd, '20000101', rd.StartDate) and datediff(dd, '20000101', rd.EndDate)	) as dtgroup by dt.HotelId, dt.RoomTypeId, dt.grporder by dt.HotelId, dt.RoomTypeId, min(dt.N);[/code]</description><pubDate>Wed, 26 Sep 2012 13:55:46 GMT</pubDate><dc:creator>bteraberry</dc:creator></item><item><title>Consolidate Overlapping Date Periods</title><link>http://www.sqlservercentral.com/Forums/Topic1364849-392-1.aspx</link><description>Getting late in the day and I could use some support from those who are feeling fresh! :-)Suppose I have a table of from/to rooms in hotels a bit like this:[code="sql"]  if object_id('tempdb.dbo.#RoomDates', 'U') is not null    drop table #RoomDates  create table #RoomDates (    HotelId int not null    ,RoomTypeId int not null    ,StartDate date not null    ,EndDate date not null    ,constraint [PK_tempRoom] primary key clustered (      HotelId      ,RoomTypeId      ,StartDate      ) with (      pad_index = off      ,statistics_norecompute = off      ,ignore_dup_key = off      ,allow_row_locks = on      ,allow_page_locks = on      ) on [PRIMARY]    ) on [PRIMARY]            insert #RoomDates ( HotelId      ,RoomTypeId      ,StartDate      ,EndDate)    values (1,1,'2012-01-01', '2012-02-01'), (1,1,'2012-01-10', '2012-02-05'), (1,1,'2012-02-4', '2012-03-01'),           (2,1,'2012-01-01', '2012-02-01'), (2,1,'2012-01-10', '2012-01-20'),           (3,1,'2012-01-01', '2012-02-01'), (3,1,'2012-03-10', '2012-03-20')        select * from #RoomDates[/code]And I want to consolidate the data in the table so that any overlapping date periods are consolidated into one row. My desired results:[code="sql"]    select HotelId      ,RoomTypeId      ,StartDate      ,EndDate      from (values (1,1,'2012-01-01', '2012-03-01'),                   (2,1,'2012-01-01', '2012-02-01'),                    (3,1,'2012-01-01', '2012-02-01'),                    (3,1,'2012-03-10', '2012-03-20')) DesiredResults (HotelId, RoomTypeId, StartDate, EndDate)[/code]Can anyone write a nice snappy query to do it?Notes1) Non-overlapping date periods should not be consolidated. 2) Consolidation is to be performed at HotelId, RoomTypeId level.Thanks v much!</description><pubDate>Wed, 26 Sep 2012 11:18:03 GMT</pubDate><dc:creator>Phil Parkin</dc:creator></item></channel></rss>