Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««1234»»

Grouping records by time interval Expand / Collapse
Author
Message
Posted Tuesday, March 5, 2013 2:58 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, March 25, 2014 7:47 AM
Points: 24, Visits: 141
ChrisM@Work (3/4/2013)
With the added cost of Event_Channel in the matrix table correlating with the inner query, it may be worthwhile splitting the query up, like this:

SELECT *
INTO #Matrix
FROM Tijdsintervallen
CROSS JOIN (
SELECT DISTINCT Ticket_Closed_Date = CAST(Ticket_Closed_DateTime AS DATE)
FROM Tussen_Tickets
) d
CROSS JOIN (
SELECT DISTINCT Event_Channel
FROM Tussen_Tickets
) e

CREATE UNIQUE CLUSTERED INDEX ucx_Everything ON #Matrix (Event_Channel, Ticket_Closed_Date, beginTijdsinterval)
-----------------------------------------------------

SELECT mx.Ticket_Closed_Date, mx.Interval, op.*
FROM #Matrix mx
OUTER APPLY (
SELECT tt.Event_Channel,
DVBIntern = SUM(tt.DVBIntern),
DVBExtern = SUM(tt.DVBExtern),
Tijdsduur = SUM(tt.Tijdsduur),
WrapUp = SUM(tt.WrapUp),
OnHold = SUM(tt.OnHold),
Talk = SUM(tt.Talk),
RowsAggregated = COUNT(*) -- not in spec but handy to have
FROM Tussen_Tickets tt
CROSS APPLY (
SELECT
Ticket_Closed_Time = CAST(tt.Ticket_Closed_DateTime AS TIME(3)),
Ticket_Closed_Date = CAST(tt.Ticket_Closed_DateTime AS DATE)
) ta
WHERE ta.Ticket_Closed_Time BETWEEN mx.beginTijdsinterval AND mx.eindTijdsinterval
AND ta.Ticket_Closed_Date = mx.Ticket_Closed_Date
AND tt.Event_Channel = mx.Event_Channel
GROUP BY ta.Ticket_Closed_Date, tt.Event_Channel
) op
ORDER BY mx.Ticket_Closed_Date, mx.Interval, op.Event_Channel;



Hi ChrisM@Work,

Thank you for your help. I did run this query and it generated 1878912 records. Taking a close look, everything is correct but if the rows are null they are repeated several times. In attachment you find the result in a zipped text file.
By the way, with the index on the Tussen_Tickets table this query runs in a little more than 1 minute. That's a large improvement.

Greetz,

Geert


  Post Attachments 
Result.zip (0 views, 4.96 MB)
Post #1426670
Posted Wednesday, March 6, 2013 12:33 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, October 17, 2014 10:43 PM
Points: 1, Visits: 199
I think the window functions should be of help here.
Please repost the data I am not able to get the data you have posted.
Post #1427208
Posted Wednesday, March 6, 2013 1:32 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 1:18 AM
Points: 6,799, Visits: 14,013
geert.de.vylder (3/5/2013)
[quote][b]... if the rows are null they are repeated several times....


There will be many many rows in the matrix table which don't have a matching ticket, hence null values in columns sourced from the ticket table. However, there shouldn't be any duplicate rows at all. Have you tested for this?

You may get another performance lift from preprocessing the ticket table:

SELECT *
INTO #Matrix
FROM Tijdsintervallen
CROSS JOIN (
SELECT DISTINCT Ticket_Closed_Date = CAST(Ticket_Closed_DateTime AS DATE)
FROM Tussen_Tickets
) d
CROSS JOIN (
SELECT DISTINCT Event_Channel
FROM Tussen_Tickets
) e

CREATE UNIQUE CLUSTERED INDEX ucx_Everything ON #Matrix (Event_Channel, Ticket_Closed_Date, beginTijdsinterval)
-----------------------------------------------------

SELECT
tt.Event_Channel,
ta.Ticket_Closed_Date,
ta.beginTijdsinterval,
DVBIntern = SUM(tt.DVBIntern),
DVBExtern = SUM(tt.DVBExtern),
Tijdsduur = SUM(tt.Tijdsduur),
WrapUp = SUM(tt.WrapUp),
OnHold = SUM(tt.OnHold),
Talk = SUM(tt.Talk),
RowsAggregated = COUNT(*) -- not in spec but handy to have
INTO #Tussen_Tickets
FROM Tussen_Tickets tt
CROSS APPLY (
SELECT
beginTijdsinterval = CAST(DATEADD(MINUTE,0-(DATEDIFF(MINUTE,0,tt.Ticket_Closed_DateTime)%15),tt.Ticket_Closed_DateTime) AS TIME(3)),
Ticket_Closed_Date = CAST(tt.Ticket_Closed_DateTime AS DATE)
) ta
GROUP BY tt.Event_Channel, ta.Ticket_Closed_Date, ta.beginTijdsinterval

CREATE UNIQUE CLUSTERED INDEX ucx_Tussen_Tickets_NearlyEverything ON #Tussen_Tickets (Event_Channel, Ticket_Closed_Date, beginTijdsinterval)

---------------------------------------

SELECT mx.Ticket_Closed_Date, mx.Event_Channel, mx.Interval,
tt.DVBIntern, tt.DVBExtern, tt.Tijdsduur, tt.WrapUp, tt.OnHold, tt.Talk, tt.RowsAggregated
FROM #Matrix mx
LEFT JOIN #Tussen_Tickets tt
ON tt.Event_Channel = mx.Event_Channel
AND tt.Ticket_Closed_Date = mx.Ticket_Closed_Date
AND tt.beginTijdsinterval = mx.beginTijdsinterval
ORDER BY mx.Ticket_Closed_Date, mx.Event_Channel, mx.Interval;
-- 11520 rows



“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1427230
Posted Wednesday, March 6, 2013 1:35 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 1:18 AM
Points: 6,799, Visits: 14,013
b ghanekar (3/6/2013)
I think the window functions should be of help here.
Please repost the data I am not able to get the data you have posted.


The fourth post on this thread has a .txt attachment which contains the ddl and dml and it works just fine. I'd be very interested to see how window functions might be used in a solution.


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1427232
Posted Wednesday, March 6, 2013 2:15 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, March 25, 2014 7:47 AM
Points: 24, Visits: 141
Hi all,

I adjusted the TestTables file. I included the complete Tussen_Tickets data which is actualy in our table. And I adjusted some other things.
Hope this helps.
You will find it in attachment.

Greetz,

Geert


  Post Attachments 
TestTables.txt (1 view, 2.76 MB)
Post #1427243
Posted Wednesday, March 6, 2013 4:35 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 1:18 AM
Points: 6,799, Visits: 14,013
geert.de.vylder (3/6/2013)
Hi all,

I adjusted the TestTables file. I included the complete Tussen_Tickets data which is actualy in our table. And I adjusted some other things.
Hope this helps.
You will find it in attachment.

Greetz,

Geert


There's an error in the ddl for the interval table - check the PK.
When providing a script for loading data, always check it first. I know from experience that
INSERT INTO ...
SELECT ... UNION ALL
throws a wobbly on anything more than say 10,000 rows. Break it up into sections of a couple of thousand rows.

Try this:
-------------------
DROP TABLE #Matrix

SELECT *
INTO #Matrix
FROM Tijdsintervallen
CROSS JOIN (
SELECT DISTINCT Ticket_Closed_Date = CAST(Ticket_Closed_DateTime AS DATE)
FROM Tussen_Tickets
WHERE Ticket_Closed_DateTime IS NOT NULL
) d
CROSS JOIN (
SELECT DISTINCT Event_Channel
FROM Tussen_Tickets
) e
-- (1,180,224 row(s) affected) / 00:00:04

CREATE UNIQUE CLUSTERED INDEX ucx_Everything ON #Matrix (Event_Channel, Ticket_Closed_Date, beginTijdsinterval)
-----------------------------------------------------
-- there's a row in Tussen_Tickets where Ticket_Closed_DateTime is NULL:
DROP TABLE #Tussen_Tickets
SELECT
tt.Event_Channel,
ta.Ticket_Closed_Date,
ta.beginTijdsinterval,
DVBIntern = SUM(tt.DVBIntern),
DVBExtern = SUM(tt.DVBExtern),
Tijdsduur = SUM(tt.Tijdsduur),
WrapUp = SUM(tt.WrapUp),
OnHold = SUM(tt.OnHold),
Talk = SUM(tt.Talk),
RowsAggregated = COUNT(*) -- not in spec but handy to have
INTO #Tussen_Tickets
FROM Tussen_Tickets tt
CROSS APPLY (
SELECT
beginTijdsinterval = CAST(DATEADD(MINUTE,0-(DATEDIFF(MINUTE,0,tt.Ticket_Closed_DateTime)%15),tt.Ticket_Closed_DateTime) AS TIME(3)),
Ticket_Closed_Date = CAST(tt.Ticket_Closed_DateTime AS DATE)
) ta
WHERE tt.Ticket_Closed_DateTime IS NOT NULL
GROUP BY tt.Event_Channel, ta.Ticket_Closed_Date, ta.beginTijdsinterval
-- (24,740 row(s) affected) / 00:00:01

CREATE UNIQUE CLUSTERED INDEX ucx_Tussen_Tickets_NearlyEverything ON #Tussen_Tickets (Event_Channel, Ticket_Closed_Date, beginTijdsinterval)

---------------------------------------

SELECT mx.Ticket_Closed_Date, mx.Event_Channel, mx.Interval,
tt.DVBIntern, tt.DVBExtern, tt.Tijdsduur, tt.WrapUp, tt.OnHold, tt.Talk, tt.RowsAggregated
FROM #Matrix mx
LEFT JOIN #Tussen_Tickets tt
ON tt.Event_Channel = mx.Event_Channel
AND tt.Ticket_Closed_Date = mx.Ticket_Closed_Date
AND tt.beginTijdsinterval = mx.beginTijdsinterval
ORDER BY mx.Ticket_Closed_Date, mx.Event_Channel, mx.Interval;
-- (1,180,224 row(s) affected) / 00:00:17 (time is network + display)



“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1427301
Posted Wednesday, March 6, 2013 6:44 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, March 25, 2014 7:47 AM
Points: 24, Visits: 141
ChrisM@Work (3/6/2013)


There's an error in the ddl for the interval table - check the PK.
When providing a script for loading data, always check it first. I know from experience that
INSERT INTO ...
SELECT ... UNION ALL
throws a wobbly on anything more than say 10,000 rows. Break it up into sections of a couple of thousand rows.

Try this:
-------------------
DROP TABLE #Matrix

SELECT *
INTO #Matrix
FROM Tijdsintervallen
CROSS JOIN (
SELECT DISTINCT Ticket_Closed_Date = CAST(Ticket_Closed_DateTime AS DATE)
FROM Tussen_Tickets
WHERE Ticket_Closed_DateTime IS NOT NULL
) d
CROSS JOIN (
SELECT DISTINCT Event_Channel
FROM Tussen_Tickets
) e
-- (1,180,224 row(s) affected) / 00:00:04

CREATE UNIQUE CLUSTERED INDEX ucx_Everything ON #Matrix (Event_Channel, Ticket_Closed_Date, beginTijdsinterval)
-----------------------------------------------------
-- there's a row in Tussen_Tickets where Ticket_Closed_DateTime is NULL:
DROP TABLE #Tussen_Tickets
SELECT
tt.Event_Channel,
ta.Ticket_Closed_Date,
ta.beginTijdsinterval,
DVBIntern = SUM(tt.DVBIntern),
DVBExtern = SUM(tt.DVBExtern),
Tijdsduur = SUM(tt.Tijdsduur),
WrapUp = SUM(tt.WrapUp),
OnHold = SUM(tt.OnHold),
Talk = SUM(tt.Talk),
RowsAggregated = COUNT(*) -- not in spec but handy to have
INTO #Tussen_Tickets
FROM Tussen_Tickets tt
CROSS APPLY (
SELECT
beginTijdsinterval = CAST(DATEADD(MINUTE,0-(DATEDIFF(MINUTE,0,tt.Ticket_Closed_DateTime)%15),tt.Ticket_Closed_DateTime) AS TIME(3)),
Ticket_Closed_Date = CAST(tt.Ticket_Closed_DateTime AS DATE)
) ta
WHERE tt.Ticket_Closed_DateTime IS NOT NULL
GROUP BY tt.Event_Channel, ta.Ticket_Closed_Date, ta.beginTijdsinterval
-- (24,740 row(s) affected) / 00:00:01

CREATE UNIQUE CLUSTERED INDEX ucx_Tussen_Tickets_NearlyEverything ON #Tussen_Tickets (Event_Channel, Ticket_Closed_Date, beginTijdsinterval)

---------------------------------------

SELECT mx.Ticket_Closed_Date, mx.Event_Channel, mx.Interval,
tt.DVBIntern, tt.DVBExtern, tt.Tijdsduur, tt.WrapUp, tt.OnHold, tt.Talk, tt.RowsAggregated
FROM #Matrix mx
LEFT JOIN #Tussen_Tickets tt
ON tt.Event_Channel = mx.Event_Channel
AND tt.Ticket_Closed_Date = mx.Ticket_Closed_Date
AND tt.beginTijdsinterval = mx.beginTijdsinterval
ORDER BY mx.Ticket_Closed_Date, mx.Event_Channel, mx.Interval;
-- (1,180,224 row(s) affected) / 00:00:17 (time is network + display)



Hi ChrisM@Work,

Thank you for the new query. The rows affected are as you mention. But the columns DVBIntern, DVBExtern, Tijdsduur, WrapUp, OnHold and Talk are all NULL or zero.

Greetz,

Geert
Post #1427356
Posted Wednesday, March 6, 2013 7:35 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 1:18 AM
Points: 6,799, Visits: 14,013
geert.de.vylder (3/6/2013)
...But the columns DVBIntern, DVBExtern, Tijdsduur, WrapUp, OnHold and Talk are all NULL or zero...


That sounds about right. The matrix table contains 1,180,224 rows (all dates and event channels found in the tickets table, all intervals ). The aggregated tickets table only contains 24,740 rows, and not much data;

SELECT
DVBIntern = SUM(tt.DVBIntern),
DVBExtern = SUM(tt.DVBExtern),
Tijdsduur = SUM(tt.Tijdsduur),
WrapUp = SUM(tt.WrapUp),
OnHold = SUM(tt.OnHold),
Talk = SUM(tt.Talk)
FROM Tussen_Tickets tt


DVBIntern DVBExtern Tijdsduur WrapUp OnHold Talk
0.........0.........270164....2.567..0......270161.433


This data set obviously isn't what you were expecting to see, even though it exactly meets your requirements. What would you like to do with it?


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1427399
Posted Wednesday, March 6, 2013 8:01 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Tuesday, March 25, 2014 7:47 AM
Points: 24, Visits: 141
ChrisM@Work (3/6/2013)


This data set obviously isn't what you were expecting to see, even though it exactly meets your requirements. What would you like to do with it?


I have to update the data set into another database on a local server in development and on another remote server in production. The other database is used for reporting.

Greetz,

Geert
Post #1427432
Posted Wednesday, March 6, 2013 8:24 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Today @ 1:18 AM
Points: 6,799, Visits: 14,013
geert.de.vylder (3/6/2013)
ChrisM@Work (3/6/2013)


This data set obviously isn't what you were expecting to see, even though it exactly meets your requirements. What would you like to do with it?


I have to update the data set into another database on a local server in development and on another remote server in production. The other database is used for reporting.

Greetz,

Geert

I meant in terms of potentially reducing the rowcount.
Let's call a set from this table containing the same day, all of the intervals, and the same Event_Channel, a frame. One frame contains 96 rows, one row per interval. There are 9 event channels so there are 864 rows per date. Multiply 864 by the number of dates in the tickets table - 1366 - to yield the number of rows in the matrix table.
One possibility is to remove frames which don't have any corresponding rows in the tickets table.


“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Exploring Recursive CTEs by Example Dwain Camps
Post #1427454
« Prev Topic | Next Topic »

Add to briefcase «««1234»»

Permissions Expand / Collapse