SQL 2012 - Question on Group by Count Queries

  • Hi,

    I have a group by query that is not totally working.  See below.  The distinct count is working but my counts for all is not counting like it should be. I

    Select Distinct st, Pro_xtyp, [Fee Schedule], [fee type],Count ([PCS Number]) as 'All',

    Count (Distinct [PCS Number]) as 'Unique',[WC and MVA Check]

    FROM vw_All_Providers_Analysis_FOR_COUNTS

    Where ST IN ('PA','NJ') and AHS = 'X' and ProductCode = 'WCP_IMW' and [WC and MVA Check] = 'GOOD'

    Group by St, Pro_xtyp, [Fee Schedule], [fee type],[WC and MVA Check]

    order by ST, pro_xtyp

     PCS Number is our unique value. IF there is more than the same PCS number it should count it, which it does. but I think it is also count the rows in by table twice if there is a duplicate

    Example would be

    PCS      product      Address
    1234      WCP        123 state road
    1234      IMW         123 state road
    1234       WCP       12 olive drive

    So if I did the distinct count it would say that there is only 1 pcs, but if I didn't do distinct and I wanted to see how many location there were for this PCS number my query would say but that is not the case, it should only count 2. Why would it count or say 3? is it because I have more in my query than I should? Should my queries be simplified more?

    Thanks in advance.

  • wsilage - Tuesday, March 13, 2018 9:48 PM

    Hi,

    I have a group by query that is not totally working.  See below.  The distinct count is working but my counts for all is not counting like it should be. I

    Select Distinct st, Pro_xtyp, [Fee Schedule], [fee type],Count ([PCS Number]) as 'All',

    Count (Distinct [PCS Number]) as 'Unique',[WC and MVA Check]

    FROM vw_All_Providers_Analysis_FOR_COUNTS

    Where ST IN ('PA','NJ') and AHS = 'X' and ProductCode = 'WCP_IMW' and [WC and MVA Check] = 'GOOD'

    Group by St, Pro_xtyp, [Fee Schedule], [fee type],[WC and MVA Check]

    order by ST, pro_xtyp

     PCS Number is our unique value. IF there is more than the same PCS number it should count it, which it does. but I think it is also count the rows in by table twice if there is a duplicate

    Example would be

    PCS      product      Address
    1234      WCP        123 state road
    1234      IMW         123 state road
    1234       WCP       12 olive drive

    So if I did the distinct count it would say that there is only 1 pcs, but if I didn't do distinct and I wanted to see how many location there were for this PCS number my query would say but that is not the case, it should only count 2. Why would it count or say 3? is it because I have more in my query than I should? Should my queries be simplified more?

    Thanks in advance.

    Remove the DISTINCT keyword from the SELECT list.
    You need to know *exactly* how your data is structured and DISTINCT is too vague, especially when mixed with a tightly-defined aggregate such as the GROUP BY.
    DISTINCT and GROUP BY are used separately for different purposes. If there are duplicates in your source data, then a more appropriate usage might be DISTINCT in a derived table then GROUP BY in the outer query.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Thank you for the info Chris. In my view called vw_All_Providers_Analysis_FOR_COUNTS, I have a select query that is DISTINCT in there. There are a few rows in that query that would alter my counts though. That is why I was thinking maybe I need to create a different type of view to exclude thoese columns. 

    I have never used a derived outer query. I will have to look around to see how that is done.

    Thanks.

  • wsilage - Wednesday, March 14, 2018 6:29 AM

    Thank you for the info Chris. In my view called vw_All_Providers_Analysis_FOR_COUNTS, I have a select query that is DISTINCT in there. There are a few rows in that query that would alter my counts though. That is why I was thinking maybe I need to create a different type of view to exclude thoese columns. 

    I have never used a derived outer query. I will have to look around to see how that is done.

    Thanks.

    Sure no problem.
    FWIW if I'm doing anything related to counting rows, DISTINCT is unlikely to appear anywhere in the process, because it eliminates rows without leaving a count of how many were eliminated. Some or all of them may be relevant. If there are spurious rows anywhere in the process then the query isn't finished, it's still in development.
    A query incorporating a derived table looks like this:
    SELECT column_list
    FROM (
         SELECT column_List
         FROM ...
    ) d

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply