Group data into 15 minute intervals

Question

Group data into 15 minute intervals

PSB

SSChampion

Points: 13195
More actions
April 8, 2015 at 9:13 am

#317018

Hi,
I want to group my data into 15 minute interval . Below is my sample data and desired result set.
Create TABLE #HalfHourlyIntervals
(
OrderDate DATETIME,
IRevenue FLOAT,
TRevenue FLOAT
)
INSERT INTO #HalfHourlyIntervals (OrderDate,IRevenue,TRevenue)
SELECT '2015-04-08 02:52:37.000',1200,0 UNION ALL
SELECT '2015-04-08 04:54:00.000',300, 0 UNION ALL
SELECT '2015-04-08 07:06:16.000',423, 0 UNION ALL
SELECT '2015-04-08 07:11:49.000',0, 0 UNION ALL
SELECT '2015-04-08 07:20:05.000',0, 0 UNION ALL
SELECT '2015-04-08 07:24:09.000',0, 0 UNION ALL
SELECT '2015-04-08 07:56:16.000',25, 25 UNION ALL
SELECT '2015-04-08 08:03:10.000',0, 0 UNION ALL
SELECT '2015-04-08 08:04:07.000',30, 0 UNION ALL
SELECT '2015-04-08 08:08:33.000',0, 0 UNION ALL
SELECT '2015-04-08 08:10:22.000',0, 0 UNION ALL
SELECT '2015-04-08 08:11:28.000',0, 0 UNION ALL
SELECT '2015-04-08 08:15:44.000',0, 0 UNION ALL
SELECT '2015-04-08 08:18:40.000',0, 0 UNION ALL
SELECT '2015-04-08 08:19:46.000',0, 0 UNION ALL
SELECT '2015-04-08 08:21:19.000',0, 0 UNION ALL
SELECT '2015-04-08 08:25:59.000',115, 0 UNION ALL
SELECT '2015-04-08 08:26:22.000',0, 0 UNION ALL
SELECT '2015-04-08 08:26:46.000' , 0 , 0 UNION ALL
SELECT '2015-04-08 08:29:05.000',0,0 UNION ALL
SELECT '2015-04-08 08:29:58.000',0,0
SELECT * FROM #HalfHourlyIntervals ORDER BY 1
DROP TABLE #HalfHourlyIntervals
--Desired result
SELECT '2015-04-08 12:00:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 12:30:00.000' AS OrderDate ,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 01:00:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 01:30:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 02:00:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 02:30:00.000' AS OrderDate,1200 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 03:00:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 03:30:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 04:00:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 04:30:00.000' AS OrderDate,300 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 05:00:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 05:30:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 06:00:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 06:30:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 07:00:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 07:30:00.000' AS OrderDate,25 AS IRevenue,25 AS TRevenue
UNION
SELECT '2015-04-08 08:00:00.000' AS OrderDate,145 AS IRevenue,0 AS TRevenue
UNION
SELECT '2015-04-08 08:30:00.000' AS OrderDate,0 AS IRevenue,0 AS TRevenue
Thanks,
PSB

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

Alan Burstein SSC Guru Points: 61152 More actions · Answer 1

You said 15 minute intervals but your example shows 30 min. To do 15 change @interval to 15. You could also dynamically calculate the start and end dates. This should be enough to get you what you need though.

DECLARE

@interval int = 30,

@start datetime = '2015-04-08 01:00:00.000',

@finish datetime = '2015-04-08 12:30:00.000';

WITH

E1(N) AS (SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1),

iTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E1 a, E1 b, E1 c),

dateRanges(s,f) AS

(

SELECT DATEADD(MINUTE, (N-1)*@interval, @start), DATEADD(MINUTE, N*@interval, @start)

FROM iTally

)

SELECT s, IRevenue = ISNULL(SUM(IRevenue),0), TRevenue = ISNULL(SUM(TRevenue),0)

FROM dateRanges dr

LEFT JOIN #HalfHourlyIntervals hhi ON OrderDate >= s AND OrderDate < f

WHERE s <= @finish

GROUP BY s

ORDER BY s;-- included for presentation; this is not needed to get the correct answer and will slow you down

Edit: Included the changes that Kevin suggested. Added a variable for the time and interval.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 2

Looks reasonable, Alan, but I think you need 2 changes:

1) The OP wanted 15 minute "buckets". Easy change obviously.

2) I would want to add in a GROUP BY clause because it is certainly conceivable that there be more than one record per 15 minute bucket. This should be straight-forward as well.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

PSB SSChampion Points: 13195 More actions · Answer 3

For the interval between 8-8:30 , it's showing multiple rows

2015-04-08 08:00:00.000300

2015-04-08 08:00:00.00000

2015-04-08 08:00:00.0001150

2015-04-08 08:00:00.00000

should be

2015-04-08 08:00:00.000,145 ,0

Bill Talada SSChampion Points: 11954 More actions · Answer 4

SELECT crdate, DATEDIFF(MINUTE,'19000101', crdate) / 15

FROM sysobjects

ORDER BY DATEDIFF(MINUTE,'19000101', crdate) / 15

WITH x AS

(

SELECT DATEDIFF(MINUTE,'19000101', crdate) / 15 as QuarterHour

FROM sysobjects

)

SELECT QuarterHour, COUNT(*)

FROM x

GROUP by QuarterHour

ORDER by QuarterHour

;

Alan Burstein SSC Guru Points: 61152 More actions · Answer 5

PSB (4/8/2015)
For the interval between 8-8:30 , it's showing multiple rows
2015-04-08 08:00:00.000300
2015-04-08 08:00:00.00000
2015-04-08 08:00:00.00000
2015-04-08 08:00:00.00000
2015-04-08 08:00:00.00000
2015-04-08 08:00:00.00000
2015-04-08 08:00:00.00000
2015-04-08 08:00:00.00000
2015-04-08 08:00:00.0001150
2015-04-08 08:00:00.00000
2015-04-08 08:00:00.00000
2015-04-08 08:00:00.00000
2015-04-08 08:00:00.00000
should be
2015-04-08 08:00:00.000,145 ,0

I edited my post. It should be fixed now.

Kevin: thanks - good catch. I set the interval as a variable because the OP said 15 min but his example was 30 min.

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

TheSQLGuru SSC Guru Points: 134017 More actions · Answer 6

Bill I think your example, while slick, doesn't meet the OPs needs, which it to have all time buckets, even if they don't have matching records.

Best,
Kevin G. Boles
SQL Server Consultant
SQL MVP 2007-2012
TheSQLGuru on googles mail service

PSB SSChampion Points: 13195 More actions · Answer 7

PSB

SSChampion

Points: 13195

April 8, 2015 at 11:04 am

#1789623

Thanks work perfectly .

Alan Burstein SSC Guru Points: 61152 More actions · Answer 8

PSB (4/8/2015)
Thanks work perfectly .

😎

"I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

-- Itzik Ben-Gan 2001

Dwain Camps SSC Guru Points: 86978 More actions · Answer 9

Just for fun, here's another way using the SQL 2012 LEAD function:

DECLARE @Interval INT = 30

,@StartDT DATETIME = '2015-04-08 01:00:00.000'

,@EndDT DATETIME = '2015-04-08 09:00:00.000';

SELECT Interval = b.Interval

,iRevenue = CASE a.Interval WHEN b.Interval THEN a.iRevenue ELSE 0 END

,TRevenue = CASE a.Interval WHEN b.Interval THEN a.tRevenue ELSE 0 END

FROM

(

SELECT Interval, iRevenue=SUM(iRevenue), TRevenue=SUM(tRevenue)

,NextInterval=LEAD(Interval, 1, Interval) OVER (ORDER BY Interval)

FROM

(

SELECT OrderDate, iRevenue, TRevenue

,Interval=DATEADD(minute, @Interval*(DATEDIFF(minute, 0, OrderDate)/@Interval), 0)

FROM

(

SELECT OrderDate, iRevenue, TRevenue

FROM #HalfHourlyIntervals

-- Remove either or both of the UNION ALLs

-- below if you just want intervals over period in data

UNION ALL

SELECT @StartDT, 0, 0

--UNION ALL

--SELECT @EndDT, 0, 0

) a

GROUP BY Interval

) a

CROSS APPLY

(

SELECT TOP (1+DATEDIFF(minute, a.Interval, a.NextInterval)/@Interval)

DATEADD(minute, @Interval*(ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1), a.Interval)

FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a1 (n)

CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a2 (n)

CROSS JOIN (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) a3 (n)

) b (Interval)

ORDER BY Interval;

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

The Dixie Flatline SSC Guru Points: 53354 More actions · Answer 10

Dangit, I read all the way down thinking about the VALUES clause and you beat me to it.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

Jeff Moden SSC Guru Points: 1004683 More actions · Answer 11

Bill Talada (4/8/2015)
SELECT crdate, DATEDIFF(MINUTE,'19000101', crdate) / 15 FROM sysobjects ORDER BY DATEDIFF(MINUTE,'19000101', crdate) / 15 WITH x AS ( SELECT DATEDIFF(MINUTE,'19000101', crdate) / 15 as QuarterHour FROM sysobjects ) SELECT QuarterHour, COUNT(*) FROM x GROUP by QuarterHour ORDER by QuarterHour ;

That would be my simple preference, as well, Bill. And, it would be easy to join to a 15 minute date/time table to fill in missing quarter hours to boot. +1000

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Dwain Camps SSC Guru Points: 86978 More actions · Answer 12

A general question here to Jeff and Joe then. Let's hear your thoughts.

I've written on Calendar tables[/url] and how they can be used (An Easter SQL[/url]). Clearly they're a valuable utility table for most applications.

I've thought about "time slot" tables before. But where I get bogged down in that concept is where do you stop? Do you do hourly time slots? Or half-hour, quarter hour, etc.?

Obviously you could go with the lowest granularity (e.g., minutes or seconds) and construct indexes to cover the other cases.

So what's the suggestion for best practice and why? Or is this something you'd do that is unique for each database and its specific needs?

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

Jeff Moden SSC Guru Points: 1004683 More actions · Answer 13

dwain.c (4/9/2015)
A general question here to Jeff and Joe then. Let's hear your thoughts.
I've written on Calendar tables[/url] and how they can be used (An Easter SQL[/url]). Clearly they're a valuable utility table for most applications.
I've thought about "time slot" tables before. But where I get bogged down in that concept is where do you stop? Do you do hourly time slots? Or half-hour, quarter hour, etc.?
Obviously you could go with the lowest granularity (e.g., minutes or seconds) and construct indexes to cover the other cases.
So what's the suggestion for best practice and why? Or is this something you'd do that is unique for each database and its specific needs?

Let's go back to Tally Tables first. I keep just 1 physical Tally Table with 11,000 rows in it (just a bit more than 30 years worth of days if I need it). What do you and I both do if we need more or if we don't actually know how many we'll need at a given time? We use Itzik's cascading CTE (cCTE) method, which also turns out to have the advantage of being read-less if not a tiny bit slower than a properly built Tally Table.

Getting back to you question about time intervals, why would you need to do any different? It'll be just a little slower than a physical table and it will always be just the right size.

In a pinch, you can have the advantage of the speed of a physical table along with JIT (Just In Time) right sizing by making a physical table as a Temp Table on the fly. Of course, you'll have reads to contend with but those reads will likely be high speed logical reads direct from memory if the Temp Table fits in memory.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Dwain Camps SSC Guru Points: 86978 More actions · Answer 14

Jeff Moden (4/9/2015)
dwain.c (4/9/2015)
A general question here to Jeff and Joe then. Let's hear your thoughts.
I've written on Calendar tables[/url] and how they can be used (An Easter SQL[/url]). Clearly they're a valuable utility table for most applications.
I've thought about "time slot" tables before. But where I get bogged down in that concept is where do you stop? Do you do hourly time slots? Or half-hour, quarter hour, etc.?
Obviously you could go with the lowest granularity (e.g., minutes or seconds) and construct indexes to cover the other cases.
So what's the suggestion for best practice and why? Or is this something you'd do that is unique for each database and its specific needs?
Let's go back to Tally Tables first. I keep just 1 physical Tally Table with 11,000 rows in it (just a bit more than 30 years worth of days if I need it). What do you and I both do if we need more or if we don't actually know how many we'll need at a given time? We use Itzik's cascading CTE (cCTE) method, which also turns out to have the advantage of being read-less if not a tiny bit slower than a properly built Tally Table.
Getting back to you question about time intervals, why would you need to do any different? It'll be just a little slower than a physical table and it will always be just the right size.
In a pinch, you can have the advantage of the speed of a physical table along with JIT (Just In Time) right sizing by making a physical table as a Temp Table on the fly. Of course, you'll have reads to contend with but those reads will likely be high speed logical reads direct from memory if the Temp Table fits in memory.

I must confess to some surprise that your permanent Tally table only contains 11,000 rows (I was thinking it was probably 1M). Aside from the undoubtedly magic number and possible heresy in that, I think it is pretty clear what your advice is on this. Build 'em as you need 'em.

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St