Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase

Not getting results expected from a Group by query Expand / Collapse
Author
Message
Posted Tuesday, January 15, 2013 8:54 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Monday, February 11, 2013 7:32 AM
Points: 36, Visits: 120
Hi All,
Hope you can shed some light. I'm a little puzzled right now. I have the following query:
SELECT 'Independance' as SiteName, COUNT(att.AttendDate) AS Attendcount,
s.ShelterName,
(Select COUNT(PatientID) as PCount from PATIENTS where HomelessShelter = s.ShelterId and activeclient > 0) as PatientCount
FROM PATIENTS p RIGHT OUTER JOIN
APGAttendance Att ON p.PatientID = att.PatientID LEFT OUTER JOIN
Shelters s ON p.HomelessShelter = s.ShelterId
WHERE ((p.HomelessShelter > 0) )
AND (att.AttendDate BETWEEN '10/29/2012' AND '11/27/2012')
--and p.ActiveClient > 0
GROUP BY s.ShelterName,s.ShelterId
Some info on the Query. I need to know the total number of services that was provided to all clients in Shelters. And I also wanted to put the total number of clients that are actually in a particular shelter. Say I have 10 clients in shelter1 who have did 30 services within a date range given.
The Problem. I happen to notice the issue because of the total number of client in a shelter count. In one of the totals it give say 58 total visits for say shelter12, but the Client count is 0. I checked the database tables to make sure that there was not any clients assigned to that shelter and there was no assingment to that shelter. So why do I still get a count for that shelter?
Thanks for any input.
Post #1407311
Posted Tuesday, January 15, 2013 9:53 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Wednesday, August 27, 2014 1:06 PM
Points: 1,563, Visits: 2,392
Michael, read this article, then post some sample data and ddl. http://www.sqlservercentral.com/articles/Best+Practices/61537/. It's hard to figure out what you are trying to get at without seeing what you are seeing. I would probably get rid of the right outer join if I were you. There is nothing wrong with it, but it is easy to become confused if you are having left and right joins in the same query. Once you post the data and ddl, I'm sure you will get a good solution.

Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
Post #1407350
Posted Tuesday, January 15, 2013 10:21 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Yesterday @ 5:15 PM
Points: 1,945, Visits: 3,008
Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. On the good side, you pretty much follow ISO-11179 data element naming conventions AND formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible AND not local dialect.

Sample data is also a good idea, along with clear specifications. It is very hard to debug code when you do not let us see it.

What you did post has problems. Using scalar subqueries in a SELECT clause cannot be optimized, not even in theory! Your code assured bad performance. Why did a “shelter_id” for a homeless shelter become a totally different kind of entity than a shelter? You have a squid turning into automobile! That is absurd! This is just a type of shelter. In ISO-11179 Standards, we will have a “shelter_id_id”, where “shelter” is the attribute, “_id” is the attribute property (there are also “_size”, “_name”, etc). The prefix “homeless” is called an attribute role. But that is a bad design; this value is a type code, not a separate usage. Likewise “active_client” is a mess; “active” is a status code and you model it as a role! Likewise, “independence” is an adjective or type code, not a name. Then we have to ask how to correct this vague, useless “client_<something>” in the schema.

If we had DDL, we could make your code about an order of magnitude better. Yes, that much! MAYBE MORE! Yes, bad schema design is lethal in SQL. Even worse, your design has data integrity flaws so you are getting the wrong answer slowly.

The use of those OUTER JOINs is also a serious problem. OUTER JOINs of any kind are infrequent in a good design. They are slow, they create NULLs, etc. The RIGHT OUTER JOIN is very, very rate; most of us read from left to right (), so we have a cultural basis to LEFT OUTER (think Latin, Greek and Cyrillic alphabets).

I need to know the total number of services that was provided to all clients in Shelters. And I also wanted to put the total number of clients that are actually in a particular shelter. Say I have 10 clients in shelter= 1 who have did 30 services within a date range given.


How do we answer that without DDL?? Where are services in the schema?

The Problem. I happen to notice the issue because of the total number of client in a shelter count. In one of the totals it give say 58 total visits for say shelter = 12, but the Client count is 0. I checked the database tables to make sure that there was not any clients assigned to that shelter AND there was no assignment to that shelter. So why do I still get a count for that shelter?


Again, because the design is a mess that cannot have an data integrity by its very structure.


Books in Celko Series for Morgan-Kaufmann Publishing
Analytics and OLAP in SQL
Data and Databases: Concepts in Practice
Data, Measurements and Standards in SQL
SQL for Smarties
SQL Programming Style
SQL Puzzles and Answers
Thinking in Sets
Trees and Hierarchies in SQL
Post #1407364
Posted Tuesday, January 15, 2013 3:34 PM
SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Yesterday @ 3:14 PM
Points: 2,098, Visits: 3,155
All I can see from what you've posted so far is the check for:

activeclient > 0

in the subquery (patient count) but not in the outer query (attendance count).

It seems possible to me that the patients that made up the attendance count are no longer active in that shelter?!


SQL DBA,SQL Server MVP('07, '08, '09)

"We came in spastic, Like tameless horses /
We left in plastic, As numbered corpses / ...
Remember Charlie, Remember Baker /
They left their childhood On every acre /
And who was wrong? And who was right? /
It didn't matter in the thick of the fight." : the inimitable Mr. Billy Joel, about the Vietnam War
Post #1407510
Posted Tuesday, January 15, 2013 5:02 PM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Monday, February 11, 2013 7:32 AM
Points: 36, Visits: 120
Here are the DDL for the tables
====Patients====
[PatientID] [varchar](20) NOT NULL, --Primary Key
[LastName] [nvarchar](25) NULL,
[FirstName] [nvarchar](25) NULL,
[AKA] [varchar](50) NULL,
[ClinicianId] [int] NULL,
[MEDICAID_ID] [nvarchar](8) NULL,
[PendingMedicaid] [tinyint] NULL,
[PrivatePatient] [tinyint] NULL,
[MedicaidEffectiveDate] [datetime] NULL,
[MedicaidTermDate] [datetime] NULL,
[DOB] [datetime] NULL,
[SEX] [char](1) NULL,
[SSN] [varchar](12) NULL,
[ActiveClient] [smallint] NULL,
[HomelessShelter] smallint null
Remaining fields left out

===APGAttendance====
[PatientId] [varchar](20) NOT NULL, --Primary
[AttendDate] [datetime] NOT NULL, --Primary
[CTPCodes] [varchar](10) NOT NULL, --Primary
[CheckinBySecurty] [bit] NULL,
[SecurityTime] [varchar](12) NULL,
[GroupTime] [varchar](12) NULL,
[LocationID] [int] NULL,
[ClinicianID] [int] NULL,
Remaining fields left out to save space

CREATE TABLE [dbo].[Shelters](
[ShelterId] [int] IDENTITY(1,1) NOT NULL, --FOREIGN key to HomelessShelter
[ShelterName] [varchar](50) NULL,
[ShelterLocation] [varchar](100) NULL,
[SortOrder] [tinyint] NULL,
[isFreedomRes] [bit] NULL,
CONSTRAINT [PK_Shelters] PRIMARY KEY CLUSTERED
(
[ShelterId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

Here is the data for Shelter table:
ShelterId Sheltername
----------- --------------------------------------------------
1 Freedom Residence I
2 Freedom Residence II
3 Freedom Residence III
4 Freedom Residence IV
5 Freedom Residence V
6 Freedom Residence VI
7 Freedom Residence VII
8 New Sight to Life
9 Jastian Management
10 Freedom Residence VIII
11 Freedom Residence VIIII
12 Joining Hands
13 Jireh Place
14 #1 Marketing/RYB/Top of the Hob
15 Uplifting Men
16 No Matter What/Rockaway
17 Forever USA
18 Well Being/WINS Enterprise
19 Steps to Success
20 Freedom Residence X
21 Freedom Residence XI
22 Freedom Residence XII
23 Freedom Residence XVI
24 Liberation
25 Freedom Residence XVII
26 Freedom Residence XVIII
27 Freedom Residence XVIIII
28 Freedom Residence XX
29 Freedom Residence XXI
30 Freedom Residence XXII
31 Freedom Residence XIII
32 Freedom Residence XIIII
33 Freedom Residence XV

The Results from the Query in question.
NarcoSite Attendcount ShelterName PatientCount
------------ ----------- --------------------------- ------------
Independance 120 Freedom Residence I 10
Independance 212 Freedom Residence II 27
Independance 250 Freedom Residence III 37
Independance 96 Freedom Residence IV 16
Independance 165 Freedom Residence V 21
Independance 170 Freedom Residence VI 19
Independance 320 Freedom Residence VIII 41
Independance 156 Freedom Residence XI 13
Independance 193 Freedom Residence XII 24
Independance 319 Freedom Residence XVI 38
Independance 58 Freedom Residence XVIII 0 <--Row in question
Independance 352 Freedom Residence XVIIII 46
Independance 14 Freedom Residence XIIII 1

BWT, I'm currently have the field Sitename in the query to distinguish which site the data is coming from and will be used in a report the data is feeding. This is part of a Union all query that retrieves the data from several servers.

In reference to the row in question, I did check the patient table and know that there are no records that have the HomelessShelter field set to 26. Not sure why it would display 58 as the count.

I did try and change around the join clause with a left join instead of an outer join. Pretty much the same results though.

I understand that the subquery may not be the best way to get the Patient Count, but was not sure of a better way to get this info without using a function, which would have to use the same type query to get the count anyways, so for now I had it within the parent query. If there is a better way, I will be happy to do it.
Thanks for the suggestions so far.
Post #1407519
Posted Wednesday, January 16, 2013 8:30 AM


SSCommitted

SSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommittedSSCommitted

Group: General Forum Members
Last Login: Wednesday, August 27, 2014 1:06 PM
Points: 1,563, Visits: 2,392
Here are the DDL for the tables

Here is the data for Shelter table:


Michael, did you read the link I posted yesterday? *All of it*? You are heading in the right direction with respect to the ddl and sample data, but you still need to do a little more. We should be able to copy and paste your ddl and sample data and run it as is, without having to type any create and insert statements. Once you do this, you will be more likely to get help. Now, there are folks who will do this for you (sometimes I do, but don't have time today), so you could wait and see if someone does, or you could help us help you.


Greg
_________________________________________________________________________________________________
The glass is at one half capacity: nothing more, nothing less.
Post #1407876
« Prev Topic | Next Topic »

Add to briefcase

Permissions Expand / Collapse