January 20, 2006 at 4:55 am
Hi
I’m trying to suss out how to write a query to produce a grouped report giving columns with counts and or sums for separate periods. The relevant bits of the two main tables are:
tblData:
Site Category Date(SmDateTime)
tblSource:
Site
The report I’m aiming for should look like this:
Site Category Pd1 Pd2
Site1 A Count Count
Site1 B Count Count
Site1 C Count Count
Site2 A Count Count
Site2 B Count Count
Site2 C Count Count
I’d like to list all sites and all categories showing a count of 0 where there are nulls. Should I create a temporary table or a view that combines the sites and categories and left join the data table to that. Any guidance would be greatly appreciated.
Cheers
Geoff
January 20, 2006 at 5:24 am
Could you give some testdata for tables tblData and tblSource along with the expected output of the query?
January 20, 2006 at 6:57 am
Hi
After some fiddling about in Excel I've produced the following:
tblData:
labID | Source | SpecCode | TestDate | TestCode | colPK |
T | site1 | ME5 | 11/01/2004 | 17HP | 30 |
T | site1 | ME5 | 12/01/2004 | 17HP | 36 |
M | site1 | GP | 02/01/2004 | 2 | |
B | site1 | ME2 | 03/01/2004 | 17HP | 11 |
B | site1 | ME2 | 04/01/2004 | 17HP | 12 |
B | site1 | ME2 | 05/01/2004 | 17HP | 13 |
B | site1 | ME5 | 06/01/2004 | 17HP | 10 |
B | site1 | ME5 | 07/01/2004 | 17HP | 14 |
B | site1 | ME5 | 08/01/2004 | 17HP | 16 |
B | site1 | ME5 | 09/01/2004 | 17HP | 17 |
B | site1 | ME5 | 10/01/2004 | 17HP | 22 |
T | site2 | GP | 01/01/2004 | 17HP | 31 |
T | site2 | ME5 | 21/01/2004 | 17HP | 48 |
T | site2 | ME5 | 22/01/2004 | 17HP | 28 |
T | site2 | ME5 | 23/01/2004 | 17HP | 29 |
T | site2 | ME5 | 24/01/2004 | 17HP | 32 |
T | site2 | ME5 | 25/01/2004 | 17HP | 37 |
T | site2 | ME5 | 26/01/2004 | 17HP | 38 |
T | site2 | ME5 | 27/01/2004 | 17HP | 39 |
B | site2 | ME5 | 13/01/2004 | 17HP | 18 |
B | site2 | ME5 | 14/01/2004 | 17HP | 19 |
B | site2 | ME5 | 15/01/2004 | 17HP | 20 |
B | site2 | ME5 | 16/01/2004 | 17HP | 21 |
B | site2 | ME5 | 17/01/2004 | 17HP | 23 |
B | site2 | ME5 | 18/01/2004 | 17HP | 24 |
B | site2 | ME5 | 19/01/2004 | 17HP | 25 |
B | site2 | ME5 | 20/01/2004 | 17HP | 26 |
M | site3 | ME5 | 28/01/2004 | 17HP | 44 |
M | site3 | ME5 | 29/01/2004 | 17HP | 45 |
M | site3 | ME5 | 30/01/2004 | 17HP | 46 |
M | site3 | ME5 | 31/01/2004 | 17HP | 47 |
B | site1 | ME5 | 07/02/2004 | 11DE | 5 |
B | site1 | ME5 | 08/02/2004 | 11DE | 6 |
B | site1 | ME5 | 09/02/2004 | 11DE | 7 |
B | site1 | ME5 | 10/02/2004 | 11DE | 8 |
B | site1 | ME5 | 11/02/2004 | 11DE | 9 |
B | site1 | OH | 15/02/2004 | 17HP | 15 |
M | site1 | ME7 | 12/02/2004 | 1 | |
M | site1 | ME7 | 13/02/2004 | 4 | |
M | site1 | SU10 | 19/02/2004 | 3 | |
B | site2 | PA1 | 16/02/2004 | 17HP | 27 |
T | site2 | NK | 14/02/2004 | 17HP | 35 |
T | site2 | PA1 | 17/02/2004 | 17HP | 40 |
M | site3 | ME5 | 01/02/2004 | 17HP | 50 |
M | site3 | PA1 | 18/02/2004 | 17HP | 49 |
T | site3 | ME5 | 02/02/2004 | 17HP | 33 |
T | site3 | ME5 | 03/02/2004 | 17HP | 34 |
T | site3 | ME5 | 04/02/2004 | 17HP | 41 |
T | site3 | ME5 | 05/02/2004 | 17HP | 42 |
T | site3 | ME5 | 06/02/2004 | 17HP | 43 |
tblSource:
pathcode | Entity | Organisation Name | RO Code | HA Code | Address Line 1 | Address Line 2 | Address Line 4 | Address Line 5 | Postcode |
site1 | x | Dr Alias | Y43 | Q02 | The Clinic | High Street | Anytown | Hertfordshire | SG1 4AB |
site2 | z | Dr Smith | Y43 | Q02 | The Surgery | Main Road | Acity | Hertfordshire | SG1 4AB |
site3 | z | Dr Jones | Y43 | Q02 | The Health Centre | Back Lane | Thevillage | Hertfordshire | SG1 4AB |
Ideal report with additional grouping:
Entity x | Jan | Feb | |
Site1 | Dr Alias | ||
Lab B | 2 | 6 | |
Lab M | 1 | 3 | |
Lab T | 8 | 0 | |
Entity y | Jan | Feb | |
Site2 | Dr Smith | ||
Lab B | 8 | 1 | |
Lab M | 8 | 0 | |
Lab T | 0 | 2 | |
Site1 | Dr Jones | ||
Lab B | 0 | 0 | |
Lab M | 4 | 2 | |
Lab T | 0 | 5 |
I think I copied the subtotals OK. There a good few addtional columns in the data table and some in the sources one but they are not relevant to the query although I've left a few in for the feel of it. The data table contains about 3,500,000 lines at present. I hope this sheds some more light on the matter.
Cheers
Geoff
January 20, 2006 at 7:07 am
I still don't see how these numbers come up. Consider Entity x, Site1, Lab B. This should be 2 in Jan and 6 in Feb. Why? Also, in your desired result, where does entity y come from? (Or should this be entity z?)
January 20, 2006 at 7:50 am
Hi
Sorry a few errors in the report, I've just gone back into Excel and make x, site1, labB 8 in Jan and 6 in Feb. You're right abou the entity - should be z. Also the line site1 under z should be site3
Should look like this:
Entity x | Jan | Feb | |
Site1 | Dr Alias | ||
Lab B | 8 | 6 | |
Lab M | 1 | 3 | |
Lab T | 2 | 0 | |
Entity z | Jan | Feb | |
Site2 | Dr Smith | ||
Lab B | 8 | 1 | |
Lab M | 0 | 0 | |
Lab T | 8 | 2 | |
Site3 | Dr Jones | ||
Lab B | 0 | 0 | |
Lab M | 4 | 2 | |
Lab T | 0 | 5 | |
31 |
January 20, 2006 at 7:58 am
A first attempt, I haven't joined with tblSource yet. Am I on the right track?
select
source,
labid,
sum(case when month(testdate) = 1 then 1 else 0 end),
sum(case when month(testdate) = 2 then 1 else 0 end)
from tblData group by source, labid
January 20, 2006 at 8:18 am
Hi
I see what you're saying, but what will happen in instances where a source has does not have an entry in the period/s?
January 20, 2006 at 8:20 am
Sorry
Just looked again - can see now.
January 20, 2006 at 8:27 am
Ok - does this give you what you want?
select
s.entity,
s.organisation,
d.source,
d.labid,
sum(case when month(d.testdate) = 1 then 1 else 0 end),
sum(case when month(d.testdate) = 2 then 1 else 0 end)
from tblSource s left join tblData d
on s.pathcode = d.source
group by s.entity, s.organisation, d.source, d.labid
order by s.entity, s.organisation, d.source, d.labid
January 20, 2006 at 8:31 am
Hi
This looks like it might do it I was just having a look to see if I can add the year to the case statement.
January 23, 2006 at 2:57 am
Hi Jesper
Sorry I didn't get back to you more positively on Friday. It was certainly very helpful. My script now looks like this:
USE PathData
SELECT d.TestDate, d.Source, d.labid, d.colPK
INTO tblDataNoXXXXX
FROM tblData d JOIN tblTestLib t
ON d.TestCode = t.[Test Code]
WHERE [Price Band] <> 'XXXXX'
GO
SELECT
s.entity, s.[organisation name], s.Site, d.source, d.labid,
SUM(CASE WHEN (YEAR(d.TestDate) = 2004 AND MONTH(d.TestDate) = 4) THEN 1 ELSE 0 end) AS 'APR 04',
SUM(CASE WHEN (YEAR(d.TestDate) = 2004 AND MONTH(d.TestDate) = 5) THEN 1 ELSE 0 end) AS 'MAY 04',
SUM(CASE WHEN (YEAR(d.TestDate) = 2004 AND MONTH(d.TestDate) = 6) THEN 1 ELSE 0 end) AS 'JUN 04',
SUM(CASE WHEN (YEAR(d.TestDate) = 2004 AND MONTH(d.TestDate) = 7) THEN 1 ELSE 0 end) AS 'JUL 04',
SUM(CASE WHEN (YEAR(d.TestDate) = 2004 AND MONTH(d.TestDate) = 8) THEN 1 ELSE 0 end) AS 'AUG 04',
SUM(CASE WHEN (YEAR(d.TestDate) = 2004 AND MONTH(d.TestDate) = 9) THEN 1 ELSE 0 end) AS 'SEP 04',
SUM(CASE WHEN (YEAR(d.TestDate) = 2004 AND MONTH(d.TestDate) = 10) THEN 1 ELSE 0 end) AS 'OCT 04',
SUM(CASE WHEN (YEAR(d.TestDate) = 2004 AND MONTH(d.TestDate) = 11) THEN 1 ELSE 0 end) AS 'NOV 04',
SUM(CASE WHEN (YEAR(d.TestDate) = 2004 AND MONTH(d.TestDate) = 12) THEN 1 ELSE 0 end) AS 'DEC 04',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 1) THEN 1 ELSE 0 end) AS 'JAN 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 2) THEN 1 ELSE 0 end) AS 'FEB 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 3) THEN 1 ELSE 0 end) AS 'MAR 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 4) THEN 1 ELSE 0 end) AS 'APR 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 5) THEN 1 ELSE 0 end) AS 'MAY 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 6) THEN 1 ELSE 0 end) AS 'JUN 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 7) THEN 1 ELSE 0 end) AS 'JUL 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 8) THEN 1 ELSE 0 end) AS 'AUG 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 9) THEN 1 ELSE 0 end) AS 'SEP 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 10) THEN 1 ELSE 0 end) AS 'OCT 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 11) THEN 1 ELSE 0 end) AS 'NOV 05',
SUM(CASE WHEN (YEAR(d.TestDate) = 2005 AND MONTH(d.TestDate) = 12) THEN 1 ELSE 0 end) AS 'DEC 05'
FROM tblSources s left join tblDataNoXXXXX d
ON s.pathcode = d.source
WHERE s.entity IN ('5GG', '5GH', '5GJ', '5GK')
GROUP BY s.entity, s.[organisation name], s.Site, d.source, d.labid
ORDER BY s.entity, s.[organisation name], s.Site, d.source, d.labid
GO
/*DROP TABLE tblDataNoXXXXX*/
GO
Is there a more elegant way of putting the select/join/tblTestLib and keep the indexing from tblData?
Cheers
Geoff
January 23, 2006 at 3:31 am
I didn't know you needed that many columns
Maybe the following query is better:
select
s.entity,
s.organisation,
d.source,
d.labid,
month(d.testdate),
year(d.testdate),
count(*)
from tblSource s left join tblData d
on s.pathcode = d.source
group by s.entity, s.organisation, d.source, d.labid, month(d.testdate), year(d.testdate)
order by s.entity, s.organisation, d.source, d.labid, month(d.testdate), year(d.testdate)
It changes the structure of the output a bit, but the data should still be there...
To produce the original output (if you need it), you might also insert the output of the select above into a new table and then query this table... I don't know if that's faster... You might also consider using the query above and then modify the result on the client, but I don't know it that's a possibility...
January 23, 2006 at 4:38 am
Hi
No worries - I just went wild and produced a set for all the data I have at present. This could be useful as I envisage having an Access front end on the db where users would select periods from/to on a form. I believe I can use VB to pass a variable from a form to a stored procedure?
In the mean time I can show them this (they have some other Ideas for me too, although I'm waiting for more details a priorties) and they can let me know exactly what they want.
Do you think I should keep the 'select into' tblNoXXXXX as a temporary measure or make it a permenant feature of the database? I am likely to need this selection in various other reports; I thought it might be a good idea to do this and then index the colPK. I intend to write some SPs to upload new data files, to the main data table, so I could include populating and indexing this at the same time.
Cheers
Geoff
January 23, 2006 at 5:16 am
1. I think so, yes. 2. I guess this depends on your requirements and the amount of data.
January 23, 2006 at 7:25 am
Thanks for your help Jesper :-)>
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply