November 12, 2008 at 7:50 am
Hi Guys. Someone probably has dealt with this before but here goes:
I need to find the total number of users and number of users who placed an order. The answers are grouped according to a SegmentationCode which defines what user they are. I tried:
select count(*) As Users
,count(distinct ordUserID) As Purchasers
,sum(ordValue * 0.86) As Value
,sgmSegmentationCode
from users
left outer join UserSegmentationCode [SegCode] on users.usrID = [SegCode].sgmUserID
left outer join orders on orders.ordUserID = users.usrID
where ((users.usrLastPurchace > dateadd(yy,-3,@EndDate) and users.usrLastAccess > dateadd(yy,-3,@EndDate))
or (users.usrLastPurchace is null --and usrRegDate > dateadd(yy,-1,@EndDate)
and users.usrLastAccess > dateadd(yy,-1,@EndDate)))
and users.usrRegDate < @EndDate
and (orders.ordState >= 16000 or orders.ordDateOpened is null)
group by [SegCode].sgmSegmentationCode
For some reason, the Users count is incorrect. I had to split it up in to 2 queries to solve this. I think it's got to do with how left joins works - note I did do a check of OrdDateOpened is null - ie if there were no orders for a particular user.
November 12, 2008 at 8:09 am
If you aliased you tables and then used Alias.column it would be a bit easier to help as we don't know what columns go with what tables. That being said I would guess that one of your problems is that you have criteria in your where clause that apply to one of the left outer join tables. If you want restrictions on tables in your outer joins you should put them in the ON clause. This affects how the data is returned.
You should check out the link in my signature line on how to post schema and data so we can better help you.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
November 13, 2008 at 2:37 am
Sorry about that. I have added the aliases now in the first post. I don't want to place any restrictions in the ON clause, because I want to count all enabled users whether they have an order or not. The where clause defines who an enabled user is, basically someone who has not purchased/logged in after a certain period. ie. Need to count distinct users if they have an orderID of null or a proper orderID. Using distinct(user.usrID) or distinct (order.ordUserID) also doesn't work
November 13, 2008 at 4:14 am
asaacks (11/13/2008)
Sorry about that. I have added the aliases now in the first post. I don't want to place any restrictions in the ON clause, because I want to count all enabled users whether they have an order or not. The where clause defines who an enabled user is, basically someone who has not purchased/logged in after a certain period. ie. Need to count distinct users if they have an orderID of null or a proper orderID. Using distinct(user.usrID) or distinct (order.ordUserID) also doesn't work
Jack's advice is sound. He's explained accurately why your query doesn't work. "I don't want to" isn't really appropriate when you've sought, and been given, the free advice of an expert.
I don't want to place any restrictions in the ON clause, because I
want to count all enabled users whether they have an order or not.
Then put the restrictions on the orders table into the ON clause for the orders table, as suggested. It won't affect the count of users, whether they have an order or not, because orders is LEFT JOINed to users. Spend a little time reading about LEFT JOIN in BOL.
The where clause defines who an enabled user is, basically someone
who has not purchased/logged in after a certain period. ie. Need to count
distinct users if they have an orderID of null or a proper orderID
Referencing a left-joined table in the WHERE clause usually converts the join to a FULL JOIN, which in your case will result in users without an order being excluded from your result set. You won't get any 'users if they have an orderID of null'.
I'd set up this query a little differently:
[font="Courier New"]SELECT COUNT(*) AS Users
,SUM(CASE WHEN o.ordUserID IS NULL THEN 0 ELSE 1 END) AS Purchasers
,SUM(o.Value) AS Value
,sgmSegmentationCode
FROM users u
LEFT OUTER JOIN UserSegmentationCode [SegCode] ON u.usrID = [SegCode].sgmUserID
LEFT OUTER JOIN (SELECT ordUserID, SUM(ordValue * 0.86) AS Value
FROM orders
WHERE o.ordState >= 16000 OR o.ordDateOpened IS NULL
GROUP BY ordUserID
) o ON o.ordUserID = u.usrID
WHERE (
(u.usrLastPurchace > DATEADD(yy,-3,@EndDate) AND u.usrLastAccess > DATEADD(yy,-3,@EndDate))
OR (u.usrLastPurchace IS NULL AND u.usrLastAccess > DATEADD(yy,-1,@EndDate))
)
AND u.usrRegDate < @EndDate
GROUP BY [SegCode].sgmSegmentationCode
[/font]
... because I'm lazy and this is easier for me to read and understand.
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 4 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply