January 28, 2012 at 2:11 pm
Here is the query that is thrwoing me off....If I comment the ReportDate portion out, I get aCount as 728 and bCount as 716 which is the right count. But when I bring in ReportDate in the group by clause
the acutal counts are off. aCount comes to be 728 but bCount comes to be 725 and I am not able to figure out why. By the way ReportDate in the table is datetime.
Please help.
SELECT
login as UserID
,CONVERT(varchar(8), mydate, 112) as ReportDate
,COUNT(distinct a) as aCount
,COUNT(distinct b) as bCount
FROM A
WHERE CONVERT(varchar(8), mydate, 112) BETWEEN '20111001' AND '20111101'
and login = '2853'
GROUP BY Login,CONVERT(varchar(8), mydate, 112) as ReportDate
January 28, 2012 at 5:06 pm
I will be glad to help, if you follow the posting guidelines.
Please post your DDL and sample data.
January 28, 2012 at 5:20 pm
SQL_Surfer (1/28/2012)
Here is the query that is thrwoing me off....If I comment the ReportDate portion out, I get aCount as 728 and bCount as 716 which is the right count. But when I bring in ReportDate in the group by clausethe acutal counts are off. aCount comes to be 728 but bCount comes to be 725 and I am not able to figure out why. By the way ReportDate in the table is datetime.
Please help.
SELECT
login as UserID
,CONVERT(varchar(8), mydate, 112) as ReportDate
,COUNT(distinct a) as aCount
,COUNT(distinct b) as bCount
FROM A
WHERE CONVERT(varchar(8), mydate, 112) BETWEEN '20111001' AND '20111101'
and login = '2853'
GROUP BY Login,CONVERT(varchar(8), mydate, 112) as ReportDate
count(distinct) is evaluated AFTER "group by" which means that duplicates within "group by" are eliminated from the count, that's why.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.January 28, 2012 at 5:36 pm
Thanks. How do I get around that?
January 28, 2012 at 6:07 pm
I think that PaulB gave you a good hint:
When you eliminate grouping by the date, you get counts for one set only and all duplicates of b within that set are eliminated.
With grouping by the date, you are dealing with (potentially) 32 groups and duplicates of b are eliminated only if they fall within the same group, i.e., belong to the same date. If they fall within different groups, they are not eliminated and your overall count is therefore higher.
January 28, 2012 at 7:42 pm
SQL_Surfer (1/28/2012)
Thanks. How do I get around that?
Well... both result sets show you two different views of reality and cannot be used to validate each other.
Depending on business requirements and the way data was modeled in your particular system you have to decide which one is the one that represents reality.
_____________________________________
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at Amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.Viewing 6 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply