SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


CTE Crazy: Sums, Counts and Pivots I'm missing something here


CTE Crazy: Sums, Counts and Pivots I'm missing something here

Author
Message
Tom257
Tom257
SSCrazy Eights
SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)

Group: General Forum Members
Points: 9975 Visits: 1503
I've not had to use SQL for a while and I'm having trouble getting to grips with aggregates again.

In the code below I think its This line that is not doing what I expect

SUM(S.CallCount) OVER(PARTITION BY S.CallTypeId, S.CallDate) AS NumCalls


I want to get a report of counts by Day and Type - with the added factor of ignoring some items in the count - the sample data is much simplified to show the specific issue I want to solve.
Here is a Sample table

-- DROP TABLE #TmpData;
CREATE TABLE #TmpData (
CallDateTime DateTime,
Duration Time,
CallTypeId Int )
INSERT INTO #TmpData
SELECT '2017-08-01 10:14:37.00', '00:00:03.450', 1 UNION ALL
SELECT '2017-08-01 10:55:46.00', NULL, 2 UNION ALL
SELECT '2017-08-01 11:30:50.00', NULL, 2 UNION ALL
SELECT '2017-08-01 12:48:02.00', '00:00:02.500', 1 UNION ALL
SELECT '2017-08-01 14:48:24.00', '00:00:02.910', 1 UNION ALL
SELECT '2017-08-01 17:05:26.00', '00:00:04.460', 1 UNION ALL
SELECT '2017-08-04 02:11:47.00', '00:13:54.790', 4 UNION ALL
SELECT '2017-08-06 01:29:13.00', '00:01:18.180', 5 UNION ALL
SELECT '2017-08-06 04:33:33.00', NULL, 3 UNION ALL
SELECT '2017-08-06 04:33:45.00', '00:00:00.000', 5 UNION ALL
SELECT '2017-08-06 04:33:49.00', NULL, 3 UNION ALL
SELECT '2017-08-06 04:35:59.00', NULL, 3 UNION ALL
SELECT '2017-08-06 05:08:22.00', '00:00:00.000', 5 UNION ALL
SELECT '2017-08-06 20:51:28.00', '00:00:03.310', 4 UNION ALL
SELECT '2017-08-08 15:49:27.00', '00:00:19.000', 5 UNION ALL
SELECT '2017-08-10 15:24:19.00', '00:04:32.790', 4 UNION ALL
SELECT '2017-08-10 18:12:30.00', NULL, 2 UNION ALL
SELECT '2017-08-10 18:13:01.00', NULL, 3 UNION ALL
SELECT '2017-08-10 18:40:06.00', NULL, 2 UNION ALL
SELECT '2017-08-10 22:35:13.00', '00:01:33.710', 4 ;


Here is my report so far :
Its wrong! it produces counts of 1 everywhere there should be a count, but some of those 1's should be true counts e.g. Column [1] on the first date [2017-08-01 should be 4
And it seems far too complicated - (I've gone CTE crazy here) - in addition to fixing the counts there has got to be a simpler way



;WITH SplitDates AS (
SELECT CAST(N.CallDateTime AS date) AS CallDate,
DATEDIFF(MILLISECOND, '1900-01-01',ISNULL(N.Duration,'00:00:00')) AS [DurationMs],
N.CallTypeId,
-- Rules - some circumstances are not counted
CASE WHEN N.CallTypeID = 4 AND ISNULL(N.Duration,'00:00:00.000') = '00:00:00.000' THEN 0
WHEN N.CallTypeID = 5 AND ISNULL(N.Duration,'00:00:00.000') = '00:00:00.000' THEN 0
ELSE 1
END AS CallCount
FROM #TmpData AS N
), DurationByDay AS (
SELECT CallDate, SUM(DurationMs) AS [Total Duration]
FROM SplitDates
GROUP BY CallDate
), SplitDays AS (
SELECT S.CallTypeId,
SUM(S.CallCount) OVER(PARTITION BY S.CallTypeId, S.CallDate) AS NumCalls,
S.CallDate
FROM SplitDates AS S
GROUP BY S.CallDate, S.CallTypeId, S.CallCount
), PivotNumbers AS (
SELECT CallDate, [1], [2], [3], [4], [5] , [Total Duration] -- TODO Convert back to HH:mm:ss here
FROM ( SELECT CallTypeId, T.CallDate, NumCalls, D.[Total Duration]
FROM SplitDays AS T
JOIN DurationByDay AS D ON D.CallDate = T.CallDate
) AS SourceTable
PIVOT ( SUM(NumCalls)
FOR CallTypeId IN ([1], [2], [3], [4], [5]) ) AS PivotTable
)
SELECT *
FROM PivotNumbers
ORDER BY CallDate

Tom257
Tom257
SSCrazy Eights
SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)

Group: General Forum Members
Points: 9975 Visits: 1503
OK so preparing this data for SSC seems to have stimuated a few grey cells

I've fixed the output it - as follows
However - can it be further simplified?


;WITH SplitDates AS (
SELECT CAST(N.CallDateTime AS date) AS CallDate,
DATEDIFF(MILLISECOND, '1900-01-01',ISNULL(N.Duration,'00:00:00')) AS [DurationMs],
N.CallTypeId,
-- Rules - some not counted
CASE WHEN N.CallTypeID = 4 AND ISNULL(N.Duration,'00:00:00.000') = '00:00:00.000' THEN 0
WHEN N.CallTypeID = 5 AND ISNULL(N.Duration,'00:00:00.000') = '00:00:00.000' THEN 0
ELSE 1
END AS CallCount
FROM #TmpData AS N
), SumCallsPerTypePerDay AS (
SELECT SUM(CallCount) AS NumCalls,
CallDate,
CallTypeId
FROM SplitDates
GROUP BY CallDate, CallTypeId
), DurationByDay AS (
SELECT CallDate, SUM(DurationMs) AS [Total Duration]
FROM SplitDates
GROUP BY CallDate
), PivotNumbers AS (
SELECT CallDate, [1], [2], [3], [4], [5] , [Total Duration]
FROM ( SELECT CallTypeId, T.CallDate, NumCalls, D.[Total Duration]
FROM SumCallsPerTypePerDay AS T
JOIN DurationByDay AS D ON D.CallDate = T.CallDate
) AS SourceTable
PIVOT ( SUM(NumCalls)
FOR CallTypeId IN ([1], [2], [3], [4], [5]) ) AS PivotTable
)
SELECT *
FROM PivotNumbers
ORDER BY CallDate

Luis Cazares
Luis Cazares
SSC Guru
SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)

Group: General Forum Members
Points: 149218 Visits: 22100
Your query looks so overwhelming when it can be so simple:

SELECT CAST( td.CallDateTime AS DATE),
COUNT(CASE WHEN td.CallTypeId = 1 THEN 1 END) AS [1],
COUNT(CASE WHEN td.CallTypeId = 2 THEN 1 END) AS [2],
COUNT(CASE WHEN td.CallTypeId = 3 THEN 1 END) AS [3],
COUNT(CASE WHEN td.CallTypeId = 4 AND Duration <> '00:00:00.000' THEN 1 END) AS [4],
COUNT(CASE WHEN td.CallTypeId = 5 AND Duration <> '00:00:00.000' THEN 1 END) AS [5],
SUM(DATEDIFF(MILLISECOND, 0,ISNULL(Duration,'00:00:00')))
FROM #TmpData AS td
GROUP BY CAST( td.CallDateTime AS DATE);


EDIT: I made a mistake on the Types 4 and 5. I have corrected it and made it even simpler.



Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?


How to post data/code on a forum to get the best help: Option 1 / Option 2
Tom257
Tom257
SSCrazy Eights
SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)

Group: General Forum Members
Points: 9975 Visits: 1503
@Luis Cazares
Thanks for this simplification - I have several of these reports to do, and this will help me get back into the SQL mindset.
drew.allen
drew.allen
SSC Guru
SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)

Group: General Forum Members
Points: 57128 Visits: 14834
Luis Cazares - Wednesday, February 28, 2018 7:41 AM
Your query looks so overwhelming when it can be so simple:

SELECT CAST( td.CallDateTime AS DATE),
COUNT(CASE WHEN td.CallTypeId = 1 THEN 1 END) AS [1],
COUNT(CASE WHEN td.CallTypeId = 2 THEN 1 END) AS [2],
COUNT(CASE WHEN td.CallTypeId = 3 THEN 1 END) AS [3],
COUNT(CASE WHEN td.CallTypeId = 4 AND Duration <> '00:00:00.000' THEN 1 END) AS [4],
COUNT(CASE WHEN td.CallTypeId = 5 AND Duration <> '00:00:00.000' THEN 1 END) AS [5],
SUM(DATEDIFF(MILLISECOND, 0,ISNULL(Duration,'00:00:00')))
FROM #TmpData AS td
GROUP BY CAST( td.CallDateTime AS DATE);


EDIT: I made a mistake on the Types 4 and 5. I have corrected it and made it even simpler.

You can make it even simpler. Null values are ignored in SUM, so there is no reason to convert it to the additive identity, i.e., zero.


SELECT CAST( td.CallDateTime AS DATE),
COUNT(CASE WHEN td.CallTypeId = 1 THEN 1 END) AS [1],
COUNT(CASE WHEN td.CallTypeId = 2 THEN 1 END) AS [2],
COUNT(CASE WHEN td.CallTypeId = 3 THEN 1 END) AS [3],
COUNT(CASE WHEN td.CallTypeId = 4 AND Duration <> '00:00:00.000' THEN 1 END) AS [4],
COUNT(CASE WHEN td.CallTypeId = 5 AND Duration <> '00:00:00.000' THEN 1 END) AS [5],
SUM(DATEDIFF(MILLISECOND, 0,Duration))
FROM #TmpData AS td
GROUP BY CAST( td.CallDateTime AS DATE);


Drew


J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
How to post data/code on a forum to get the best help.
How to Post Performance Problems
Make sure that you include code in the appropriate IFCode tags, e.g. [code=sql]<your code here>[/code]. You can find the IFCode tags under the INSERT options when you are writing a post.
drew.allen
drew.allen
SSC Guru
SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)SSC Guru (57K reputation)

Group: General Forum Members
Points: 57128 Visits: 14834
Part of the problem you were running into was your choice of using PIVOT instead of the more flexible CROSS TAB, which Luis used.
  • PIVOT is limited to aggregating a single attribute, but you want to aggregate TWO attributes (CallCount and Duration).
  • PIVOT can only reference one expression for the column definitions, but you needed to reference two (CallType and Duration).
I tend not to use PIVOT at all, because of these limitations.
Drew


J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
How to post data/code on a forum to get the best help.
How to Post Performance Problems
Make sure that you include code in the appropriate IFCode tags, e.g. [code=sql]<your code here>[/code]. You can find the IFCode tags under the INSERT options when you are writing a post.
Luis Cazares
Luis Cazares
SSC Guru
SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)

Group: General Forum Members
Points: 149218 Visits: 22100
drew.allen - Wednesday, February 28, 2018 8:42 AM
Luis Cazares - Wednesday, February 28, 2018 7:41 AM
Your query looks so overwhelming when it can be so simple:

SELECT CAST( td.CallDateTime AS DATE),
COUNT(CASE WHEN td.CallTypeId = 1 THEN 1 END) AS [1],
COUNT(CASE WHEN td.CallTypeId = 2 THEN 1 END) AS [2],
COUNT(CASE WHEN td.CallTypeId = 3 THEN 1 END) AS [3],
COUNT(CASE WHEN td.CallTypeId = 4 AND Duration <> '00:00:00.000' THEN 1 END) AS [4],
COUNT(CASE WHEN td.CallTypeId = 5 AND Duration <> '00:00:00.000' THEN 1 END) AS [5],
SUM(DATEDIFF(MILLISECOND, 0,ISNULL(Duration,'00:00:00')))
FROM #TmpData AS td
GROUP BY CAST( td.CallDateTime AS DATE);


EDIT: I made a mistake on the Types 4 and 5. I have corrected it and made it even simpler.

You can make it even simpler. Null values are ignored in SUM, so there is no reason to convert it to the additive identity, i.e., zero.


SELECT CAST( td.CallDateTime AS DATE),
COUNT(CASE WHEN td.CallTypeId = 1 THEN 1 END) AS [1],
COUNT(CASE WHEN td.CallTypeId = 2 THEN 1 END) AS [2],
COUNT(CASE WHEN td.CallTypeId = 3 THEN 1 END) AS [3],
COUNT(CASE WHEN td.CallTypeId = 4 AND Duration <> '00:00:00.000' THEN 1 END) AS [4],
COUNT(CASE WHEN td.CallTypeId = 5 AND Duration <> '00:00:00.000' THEN 1 END) AS [5],
SUM(DATEDIFF(MILLISECOND, 0,Duration))
FROM #TmpData AS td
GROUP BY CAST( td.CallDateTime AS DATE);


Drew


Except that in some cases, you might get a NULL instead of a zero. It's a matter of what you expect as part of the results.


Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?


How to post data/code on a forum to get the best help: Option 1 / Option 2
Joe Celko
Joe Celko
SSCertifiable
SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)SSCertifiable (5.2K reputation)

Group: General Forum Members
Points: 5162 Visits: 3068
[quote]
Tom257 - Wednesday, February 28, 2018 7:19 AM
I've not had to use SQL for a while and I'm having trouble getting to grips with aggregates again.

In the code below I think its This line that is not doing what I expect

SUM(S.CallCount) OVER(PARTITION BY S.CallTypeId, S.CallDate) AS NumCalls

I want to get a report of counts by Day and Type - with the added factor of ignoring some items in the count - the sample data is much simplified to show the specific issue I want to solve.
Here is a Sample table

-- DROP TABLE #TmpData;
CREATE TABLE #TmpData (
CallDateTime DateTime,
Duration Time,
CallTypeId Int )
INSERT INTO #TmpData
SELECT '2017-08-01 10:14:37.00', '00:00:03.450', 1 UNION ALL
SELECT '2017-08-01 10:55:46.00', NULL, 2 UNION ALL
SELECT '2017-08-01 11:30:50.00', NULL, 2 UNION ALL
SELECT '2017-08-01 12:48:02.00', '00:00:02.500', 1 UNION ALL
SELECT '2017-08-01 14:48:24.00', '00:00:02.910', 1 UNION ALL
SELECT '2017-08-01 17:05:26.00', '00:00:04.460', 1 UNION ALL
SELECT '2017-08-04 02:11:47.00', '00:13:54.790', 4 UNION ALL
SELECT '2017-08-06 01:29:13.00', '00:01:18.180', 5 UNION ALL
SELECT '2017-08-06 04:33:33.00', NULL, 3 UNION ALL
SELECT '2017-08-06 04:33:45.00', '00:00:00.000', 5 UNION ALL
SELECT '2017-08-06 04:33:49.00', NULL, 3 UNION ALL
SELECT '2017-08-06 04:35:59.00', NULL, 3 UNION ALL
SELECT '2017-08-06 05:08:22.00', '00:00:00.000', 5 UNION ALL
SELECT '2017-08-06 20:51:28.00', '00:00:03.310', 4 UNION ALL
SELECT '2017-08-08 15:49:27.00', '00:00:19.000', 5 UNION ALL
SELECT '2017-08-10 15:24:19.00', '00:04:32.790', 4 UNION ALL
SELECT '2017-08-10 18:12:30.00', NULL, 2 UNION ALL
SELECT '2017-08-10 18:13:01.00', NULL, 3 UNION ALL
SELECT '2017-08-10 18:40:06.00', NULL, 2 UNION ALL
SELECT '2017-08-10 22:35:13.00', '00:01:33.710', 4 ;


Here is my report so far :
Its wrong! it produces counts of 1 everywhere there should be a count, but some of those 1's should be true counts e.g. Column [1] on the first date [2017-08-01 should be 4
And it seems far too complicated - (I've gone CTE crazy here) - in addition to fixing the counts there has got to be a simpler way



;WITH SplitDates AS (
SELECT CAST(N.CallDateTime AS date) AS CallDate,
DATEDIFF(MILLISECOND, '1900-01-01',ISNULL(N.Duration,'00:00:00')) AS [DurationMs],
N.CallTypeId,
-- Rules - some circumstances are not counted
CASE WHEN N.CallTypeID = 4 AND ISNULL(N.Duration,'00:00:00.000') = '00:00:00.000' THEN 0
WHEN N.CallTypeID = 5 AND ISNULL(N.Duration,'00:00:00.000') = '00:00:00.000' THEN 0
ELSE 1
END AS CallCount
FROM #TmpData AS N
), DurationByDay AS (
SELECT CallDate, SUM(DurationMs) AS [Total Duration]
FROM SplitDates
GROUP BY CallDate
), SplitDays AS (
SELECT S.CallTypeId,
SUM(S.CallCount) OVER(PARTITION BY S.CallTypeId, S.CallDate) AS NumCalls,
S.CallDate
FROM SplitDates AS S
GROUP BY S.CallDate, S.CallTypeId, S.CallCount
), PivotNumbers AS (
SELECT CallDate, [1], [2], [3], [4], [5] , [Total Duration] -- TODO Convert back to HH:mm:ss here
FROM ( SELECT CallTypeId, T.CallDate, NumCalls, D.[Total Duration]
FROM SplitDays AS T
JOIN DurationByDay AS D ON D.CallDate = T.CallDate
) AS SourceTable
PIVOT ( SUM(NumCalls)
FOR CallTypeId IN ([1], [2], [3], [4], [5]) ) AS PivotTable
)
SELECT *
FROM PivotNumbers
ORDER BY CallDate
>>I've not had to use SQL for a while and I'm having trouble getting togrips with aggregates again. <<




actually,you have some more fundamental problems. What you posted is not a table! By definition, a table must have a key and you have no way to ever have a key, because everything is nullable. You then committed a design fallacy called “attribute splitting” by putting a date, a timestamp and a time in their own columns. Let’s go ahead and try and correct your design and bring you up to the current SQL standards(well, actually, what’s been in effect for over 15 years).




Please think about how silly a “<something>_type_id” is; what is your blood type ID? If you read a book on basic data modeling, youwould learn that an attribute property (that’s the ISO term for it)can be a type or an ID, but not both in some weird silly hybrid.Also, neither a type nor an identifier can be numeric; what math were you going to do on it? What is the square root of your credit cardnumber?

CREATE TABLE Foobars

(call_start_timestamp DATETIME2(0) NOT NULL PRIMARY KEY,

call_end_timestamp DATETIME2(0),

CHECK(call_start_timestamp <= call_end_timestamp ) ,

call_typeCHAR(1) NOT NULL

CHECK(call_type IN (‘01’, ‘02’. ‘03’, ’04’);




Please use the ANSI/ISO syntax for insertion. Why still use the old Sybase conventions? When you use the original Sybase row-at-a time, theoptimizer can do anything with it. Finally, you made a huge conceptual error; TIME data type is not a duration; it’s a clocktime! What you’re doing is fundamentally wrong. This really isn’tyour fault, so much as the way we measure time. Get a copy of theRick Snodgrass PDF download from the University of Arizona on temporal data in SQL. In SQL server you’re going to need a start and end timestamp (Microsoft has no INTERVAL data type). It would look like this:

[code language="sql"]INSERT INTO Foobars

VALUES

('2017-08-0110:55:46', '2017-08-01 10:55:46', '02'),

('2017-08-0111:30:50', '2017-08-01 11:30:50', '02'),

etc




>I want to get a report[sic:queries are not reports;]of counts by call_dateand call_type, with the added factor of ignoring some items in the count - thesample data is much simplified to show the specific issue I want tosolve. <<




UseCAST (call_start_timestamp AS DATE) AS all_date for your grouping.






>>I've not had to use SQL for a while and I'm having trouble getting togrips with aggregates again. <<




actually,you have some more fundamental problems. What you posted is not atable! By definition, a table must have a key and you have no way to ever have a key, because everything is nullable. You then committed a design fallacy called “attribute splitting” by putting a date, a timestamp and a time in their own columns. Let’s go ahead and try and correct your design and bring you up to the current SQL standards(well, actually, what’s been in effect for over 15 years).




Pleasethink about how silly a “<something>_type_id” is; what isyour blood type ID? If you read a book on basic data modeling, youwould learn that an attribute property (that’s the ISO term for it)can be a type or an ID, but not both in some weird silly hybrid.Also, neither a type nor an identifier can be numeric; what math wereyou going to do on it? What is the square root of your credit cardnumber?




CREATETABLE Foobars

(call_start_timestamp DATETIME2(0) NOT NULL PRIMARY KEY,

call_end_timestamp DATETIME2(0),

CHECK(call_start_timestamp <= call_end_timestamp ) ,

call_typeCHAR(1) NOT NULL

CHECK(call_type IN (‘01’, ‘02’. ‘03’, ’04’);




Pleasethe ANSI/ISO syntax for insertion. Why still use the old Sybaseconventions? When you use the original Sybase row-at-a time, theoptimizer can do anything with it. Finally, you made a hugeconceptual error; TIME data type is not a duration; it’s a clocktime! What you’re doing is fundamentally wrong. This really isn’tyour fault, so much as the way we measure time. Get a copy of theRick Snodgrass PDF download from the University of Arizona ontemporal data in SQL. In SQL server you’re going to need a startand end timestamp (Microsoft has no INTERVAL data type). It wouldlook like this:




INSERTINTO Foobars

VALUES

('2017-08-0110:55:46', '2017-08-01 10:55:46', '02'),

('2017-08-0111:30:50', '2017-08-01 11:30:50', '02'),

etc




>I want to get a report[sic:queries are not reports;]of counts by call_dateand call_type, with the added factor of ignoring some items in the count - thesample data is much simplified to show the specific issue I want tosolve. <<




Use CAST (call_start_timestamp AS DATE) AS all_date for your grouping.




Pleasethe ANSI/ISO syntax for insertion. Why still use the old Sybaseconventions? When you use the original Sybase row-at-a time, theoptimizer can do anything with it. Finally, you made a hugeconceptual error; TIME data type is not a duration; it’s a clocktime! What you’re doing is fundamentally wrong. This really isn’tyour fault, so much as the way we measure time. Get a copy of theRick Snodgrass PDF download from the University of Arizona ontemporal data in SQL. In SQL server you’re going to need a startand end timestamp (Microsoft has no INTERVAL data type). It wouldlook like this: INSERTINTO FoobarsVALUES('2017-08-0110:55:46', '2017-08-01 10:55:46', '02'),('2017-08-0111:30:50', '2017-08-01 11:30:50', '02'), etc>I want to get a report[sic:queries are not reports;]of counts by call_dateand call_type, with the added factor of ignoring some items in the count - thesample data is much simplified to show the specific issue I want tosolve. <<UseCAST (call_start_timestamp AS DATE) AS all_date for your grouping. >>I've not had to use SQL for a while and I'm having trouble getting togrips with aggregates again. <<actually,you have some more fundamental problems. What you posted is not atable! By definition, a table must have a key and you have no way toever have a key, because everything is nullable. You then committed adesign fallacy called “attribute splitting” by putting a date, atimestamp and a time in their own columns. Let’s go ahead and tryand correct your design and bring you up to the current SQLstandards(well, actually, what’s been in effect for over 15 years).Pleasethink about how silly a “<something>_type_id” is; what isyour blood type ID? If you read a book on basic data modeling, youwould learn that an attribute property (that’s the ISO term for it)can be a type or an ID, but not both in some weird silly hybrid.Also, neither a type nor an identifier can be numeric; what math wereyou going to do on it? What is the square root of your credit cardnumber?CREATETABLE Foobars(call_start_timestamp DATETIME2(0) NOT NULL PRIMARY KEY, call_end_timestamp DATETIME2(0), CHECK(call_start_timestamp <= call_end_timestamp ) , call_typeCHAR(1) NOT NULL CHECK(call_type IN (‘01’, ‘02’. ‘03’, ’04’);Pleasethe ANSI/ISO syntax for insertion. Why still use the old Sybaseconventions? When you use the original Sybase row-at-a time, theoptimizer can do anything with it. Finally, you made a hugeconceptual error; TIME data type is not a duration; it’s a clocktime! What you’re doing is fundamentally wrong. This really isn’tyour fault, so much as the way we measure time. Get a copy of theRick Snodgrass PDF download from the University of Arizona ontemporal data in SQL. In SQL server you’re going to need a startand end timestamp (Microsoft has no INTERVAL data type). It wouldlook like this: INSERTINTO FoobarsVALUES('2017-08-0110:55:46', '2017-08-01 10:55:46', '02'),('2017-08-0111:30:50', '2017-08-01 11:30:50', '02'), etc>I want to get a report[sic:queries are not reports;]of counts by call_dateand call_type, with the added factor of ignoring some items in the count - thesample data is much simplified to show the specific issue I want tosolve. <<UseCAST (call_start_timestamp AS DATE) AS all_date for your grouping. [/code]
etc>I want to get a report[sic:queries are not reports;]of counts by call_dateand call_type, with the added factor of ignoring some items in the count - thesample data is much simplified to show the specific issue I want tosolve. <<UseCAST (call_start_timestamp AS DATE) AS all_date for your grouping. >>I've not had to use SQL for a while and I'm having trouble getting togrips with aggregates again. <<actually,you have some more fundamental problems. What you posted is not atable! By definition, a table must have a key and you have no way toever have a key, because everything is nullable. You then committed adesign fallacy called “attribute splitting” by putting a date, atimestamp and a time in their own columns. Let’s go ahead and tryand correct your design and bring you up to the current SQLstandards(well, actually, what’s been in effect for over 15 years).Pleasethink about how silly a “<something>_type_id” is; what isyour blood type ID? If you read a book on basic data modeling, youwould learn that an attribute property (that’s the ISO term for it)can be a type or an ID, but not both in some weird silly hybrid.Also, neither a type nor an identifier can be numeric; what math wereyou going to do on it? What is the square root of your credit cardnumber?CREATETABLE Foobars(call_start_timestamp DATETIME2(0) NOT NULL PRIMARY KEY, call_end_timestamp DATETIME2(0), CHECK(call_start_timestamp <= call_end_timestamp ) , call_typeCHAR(1) NOT NULL CHECK(call_type IN (‘01’, ‘02’. ‘03’, ’04’);Pleasethe ANSI/ISO syntax for insertion. Why still use the old Sybaseconventions? When you use the original Sybase row-at-a time, theoptimizer can do anything with it. Finally, you made a huge conceptual error; TIME data type is not a duration; it’s a clocktime! What you’re doing is fundamentally wrong. This really isn’t your fault, so much as the way we measure time. Get a copy of the Rick Snodgrass PDF download from the University of Arizona on temporal data in SQL. In SQL server you’re going to need a start and end timestamp (Microsoft has no INTERVAL data type). It wouldlook like this: INSERT INTO Foobars VALUES('2017-08-0110:55:46', '2017-08-01 10:55:46', '02'),('2017-08-0111:30:50', '2017-08-01 11:30:50', '02'), etc>I want to get a report[sic:queries are not reports;]of counts by call_date and call_type, with the added factor of ignoring some items in the count - the sample data is much simplified to show the specific issue I want to solve. <<
UseCAST (call_start_timestamp AS DATE) AS all_date for your grouping. [/code]Please use the ANSI/ISO syntax for insertion. Why still use the old Sybaseconventions? When you use the original Sybase row-at-a time, the optimizer can do anything with it. Finally, you made a hugeconceptual error; TIME data type is not a duration; it’s a clocktime! What you’re doing is fundamentally wrong. This really isn’tyour fault, so much as the way we measure time. Get a copy of theRick Snodgrass PDF download from the University of Arizona on temporal data in SQL. In SQL server you’re going to need a start and end timestamp (Microsoft has no INTERVAL data type). It would look like this: INSERT INTO Foobars VALUES('2017-08-0110:55:46', '2017-08-01 10:55:46', '02'),('2017-08-0111:30:50', '2017-08-01 11:30:50', '02'), etc>I want to get a report[sic:queries are not reports;]of counts by call_date and call_type, with the added factor of ignoring some items in the count - the sample data is much simplified to show the specific issue I want to solve. <<Use Cast (call_start_timestamp AS DATE) AS all_date for your grouping.
>>I've not had to use SQL for a while and I'm having trouble getting togrips with aggregates again. <<actually,you have some more fundamental problems. What you posted is not a
table! By definition, a table must have a key and you have no way toever have a key, because everything is nullable. You then committed adesign fallacy called “attribute splitting” by putting a date, atimestamp and a time in their own columns. Let’s go ahead and tryand correct your design and bring you up to the current SQL standards(well, actually, what’s been in effect for over 15 years).Pleasethink about how silly a “<something>_type_id” is; what is your blood type ID? If you read a book on basic data modeling, you would learn that an attribute property (that’s the ISO term for it)can be a type or an ID, but not both in some weird silly hybrid.Also, neither a type nor an identifier can be numeric; what math wereyou going to do on it? What is the square root of your credit cardnumber?

CREATETABLE Foobars(call_start_timestamp DATETIME2(0) NOT NULL PRIMARY KEY, call_end_timestamp DATETIME2(0), CHECK(call_start_timestamp <= call_end_timestamp ) , call_typeCHAR(1) NOT NULL CHECK(call_type IN (‘01’, ‘02’. ‘03’, ’04’);Pleasethe ANSI/ISO syntax for insertion. Why still use the old Sybase conventions? When you use the original Sybase row-at-a time, the optimizer can do anything with it. Finally, you made a huge conceptual error; TIME data type is not a duration; it’s a clocktime! What you’re doing is fundamentally wrong. This really isn't your fault, so much as the way we measure time. Get a copy of the Rick Snodgrass PDF download from the University of Arizona on temporal data in SQL. In SQL server you’re going to need a startand end timestamp (Microsoft has no INTERVAL data type). It would look like this: INSERTINTO Foobars VALUES('2017-08-0110:55:46', '2017-08-01 10:55:46', '02'),('2017-08-0111:30:50', '2017-08-01 11:30:50', '02'), etc>I want to get a report[sic:queries are not reports;]of counts by call_date and call_type, with the added factor of ignoring some items in the count - thesample data is much simplified to show the specific issue I want to solve. <<Use CAST (call_start_timestamp AS DATE) AS all_date for your grouping.

Please post DDL and follow ANSI/ISO standards when asking for help.
Luis Cazares
Luis Cazares
SSC Guru
SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)SSC Guru (149K reputation)

Group: General Forum Members
Points: 149218 Visits: 22100
Bravo! You really out did yourself, Joe.
I wonder if anyone can make sense of your post.


Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?


How to post data/code on a forum to get the best help: Option 1 / Option 2
Tom257
Tom257
SSCrazy Eights
SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)SSCrazy Eights (10K reputation)

Group: General Forum Members
Points: 9975 Visits: 1503
@Joe Celko

Wow Joe, thanks for the thorough treatise!

In answer: The original table does have a key, I omitted it as part of the simplification process as it plays no part in the output, as I omitted about 30 other columns from the original table.

The data presented is intended to isolate the specific problem I was running into, which enabled Luis Cazares to immediately see where I was going wrong.

The Duration column - You're right, it's not good to store a timespan in a time column - as you can see by converting to milliseconds for the SUM operation - but once again, the data is already there, and it doesn't include an end time. I guess that underlying this some physical device records a number of milliseconds or ticks and some *bright* developer decided to convert that to a time. If I ever get to that part of the system I'll see if its possible to store end time, or the timespan as a number.

CallType is a lookup table - and CallTypeId is a foreign key created automatically by some relationship modelling software - again non under my control and irrelevant to the problem.

And while you're marking my question like some schoolkid's homework BigGrin - maybe improve the way you present your comments, I probably missed half of what you said because of repetition and poor formatting. Crazy
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum








































































































































































SQLServerCentral


Search