COUNT without the INNER JOIN DUPLICATES

  • Hi,

    I have the first statement below which returns 108 records (no duplicates), if i remove the distinct keyword it will return 410 records(with duplicates)

    SELECT distinct dbo.tblUsers.strUserName, dbo.tblCards.dtePrintDate AS PrintDate, dbo.tblCardFormatTypes.strName AS CardType, dbo.tblPersons.CSU_ID AS ID_NUMBER, dbo.tblPersons.LAST_NAME, dbo.tblPersons.FIRST_NAME

    FROM dbo.tblUsers

    INNER JOIN dbo.tblEvents ON dbo.tblUsers.ID = dbo.tblEvents.UserID

    INNER JOIN dbo.tblCards ON dbo.tblEvents.CardID = dbo.tblCards.ID

    INNER JOIN dbo.tblCardFormatTypes ON dbo.tblCards.intCardFormatType = dbo.tblCardFormatTypes.ID

    INNER JOIN dbo.tblPersons ON dbo.tblCards.intPersonID = dbo.tblPersons.Universe_ID

    WHERE (dbo.tblEvents.EventType = 'CARD_PRINT' AND tblCards.dtePrintDate >= '31/03/2008' AND tblCards.dtePrintDate <= '1/04/2008 23:59:59')

    Now I need the statement above is a detail report. I need a summary of that the returns only the name of the User and how many(count) that user printed so I have this:

    SELECT a.strUserName, Count(b.UserID) AS CardsPrinted

    FROM dbo.tblUsers a

    inner JOIN dbo.tblEvents b ON a.ID = b.UserID

    inner JOIN dbo.tblCards c ON b.CardID = c.ID

    WHERE (b.EventType = 'CARD_PRINT' AND c.dtePrintDate >= '31/03/2008' AND

    c.dtePrintDate <= '1/04/2008 23:59:59')

    GROUP BY a.strUserName

    This statement returns:

    UserName CardsPrinted

    Username1 410

    The return includes the duplicate. How can I get the result I want without the duplicate.

    Thanks for any help.:)

  • If to follow the logic of your query this is what you are trying to get:

    SELECT a.strUserName, Count(A.ID) AS CardsPrinted

    FROM dbo.tblUsers a

    WHERE EXISTS(select 1

    from dbo.tblEvents b

    inner JOIN dbo.tblCards c ON b.CardID = c.ID

    where a.ID = b.UserID

    and (b.EventType = 'CARD_PRINT'

    AND c.dtePrintDate >= '31/03/2008'

    AND c.dtePrintDate <= '1/04/2008 23:59:59')

    )

    GROUP BY a.strUserName

    But id to look at the name of the column "CardsPrinted" I'd suspect the logic is incorrect.

    Can you explain what exactly you want to count?

    _____________
    Code for TallyGenerator

  • Thanks for the reply here are some details:

    I have these tables with some columns

    tblEvents:

    UserId

    CardID

    EventType

    tblUser:

    ID

    UserName

    tblCards:

    ID

    dtePrintDate

    intIssuedBY (same as UserID)

    ...(many more)

    I want to count in the tblEvents how many cards each user has printed.

    so the tblCards are records of cards printed

    tblEvents records the event CARD_PRINTED which contain the userid and the cardid.

    I hope this makes it clearer. Thanks for your help.

  • Why did you decide those 410 records are duplicates?

    In your DISTINCT query you group by PrintDate, CardType.

    So, if they have printed 10 cards of the same type on the same date you'll get 1 distinct line despite there were in fact 10 cards printed.

    Can you explain the logic of your query?

    _____________
    Code for TallyGenerator

  • The first query:

    SELECT distinct dbo.tblUsers.strUserName, dbo.tblCards.dtePrintDate AS PrintDate, dbo.tblCardFormatTypes.strName AS CardType, dbo.tblPersons.CSU_ID AS ID_NUMBER, dbo.tblPersons.LAST_NAME, dbo.tblPersons.FIRST_NAME

    FROM dbo.tblUsers

    INNER JOIN dbo.tblEvents ON dbo.tblUsers.ID = dbo.tblEvents.UserID

    INNER JOIN dbo.tblCards ON dbo.tblEvents.CardID = dbo.tblCards.ID

    INNER JOIN dbo.tblCardFormatTypes ON dbo.tblCards.intCardFormatType = dbo.tblCardFormatTypes.ID

    INNER JOIN dbo.tblPersons ON dbo.tblCards.intPersonID = dbo.tblPersons.Universe_ID

    WHERE (dbo.tblEvents.EventType = 'CARD_PRINT' AND tblCards.dtePrintDate >= '31/03/2008' AND tblCards.dtePrintDate <= '1/04/2008 23:59:59')

    I need to grab all records in tblEvents that have CARD_PRINT as eventType.

    Now using the CardID i will cross reference only those who have been printed by dtePrintDate.

    Using some FK i grab details from tblPerson, tblUsers

    There is no grouping in here. Like you mentioned by Card Type.

    The 410 records if without the DISTINCT have duplicate rows. These might be caused by the join. The records say on tblCards or tblEvents are not duplicate.

    Also what you said as having 10 cards printed for one type is really not related. But it can happen, although they should not appear as duplicate row since the dtePrintdate would be unique for each card printed.

    The problem is with how I can count number of rows in the tblEvents table where CARD_PRINTED for each user and then checking that the cardid in the rows fall into the required date.

    Is it clear?

    Thanks for the help. Hope you can point me to what is not right.

  • Can you add dbo.tblCards.ID to SELECT to see if those records are actually duplicates?

    If dbo.tblCards.ID will not repeat then they are not.

    Joins don't add duplications.

    Never.

    It's either bad data or wrong query logic.

    Can you figure out what it is in your case?

    _____________
    Code for TallyGenerator

  • Hi thanks for everything,

    I figured it out.

    The problem was the tblEvents table have duplicate cardID which should be okay but I thought it was wrong because in the actual tblCards there is only one record for say cardid = 1.

    The logic is correct but the data was somehow been changed since a delete on tblCards does not delete the events table.

    I think the better logic is to draw the report from the tblCards table itself and not from Events table.

    I thought inner join will cause some duplicates as ive read a lot of case like that from different sources when i googled it.

    Thanks again.

Viewing 7 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply