Select Distinct - Multiple Columns

  • I want to select ColumnA, ColumnB, and ColumnC, but I want DISTINCT to apply only to ColumnA.

    I do need the other two columns because they are used in Calculated fields.

    Thank you.

  • Some sample data and required output would be very valuable here. See the first link in my signature for more info.

    You need to decide whether you'll do the grouping in the SQL query or at report rendering. Also, if you're aggregating the data and want to do it in the dataset rather than at rendering, you'll need a GROUP BY clause, not the DISTINCT keyword.

    -Luke.

    To help us help you, read this. For better help with performance problems, please read this.

  • Luke L (9/21/2010)


    Some sample data and required output would be very valuable here. See the first link in my signature for more info.

    You need to decide whether you'll do the grouping in the SQL query or at report rendering. Also, if you're aggregating the data and want to do it in the dataset rather than at rendering, you'll need a GROUP BY clause, not the DISTINCT keyword.

    -Luke.

    If possible, I want to do everything at rendering. I just want the records that I need returned in the dataset. Distinct only on columnA.

    PS. I was wondering when you'd get to my post. 🙂 I saw your sig moving up the list of posts.:-P

  • So, the sample data and expected output?

    If you're doing everything at rendering, just pull everything, then group by ColumnA and add your calculations on columns B, C, or whatnot. It's about the best I can do without something more concrete from you.

    I'm not sure rendering is the best place to do all of your aggregation, but YMMV depending on how much data you're talking about. There's no need to pull 1000 rows across the network just to aggregate them down to 5 when you can do that at the database and pull your 5 records. While SSRS does a pretty good job of this, the larger the dataset, the longer it takes to reach the SSRS server across the network, and thus the longer the user waits on rendering.

    -Luke.

  • Luke L (9/21/2010)


    So, the sample data and expected output?

    If you're doing everything at rendering, just pull everything, then group by ColumnA and add your calculations on columns B, C, or whatnot. It's about the best I can do without something more concrete from you.

    I'm not sure rendering is the best place to do all of your aggregation, but YMMV depending on how much data you're talking about. There's no need to pull 1000 rows across the network just to aggregate them down to 5 when you can do that at the database and pull your 5 records. While SSRS does a pretty good job of this, the larger the dataset, the longer it takes to reach the SSRS server across the network, and thus the longer the user waits on rendering.

    -Luke.

    In fact, by selecting only those records which have a unique column A, I'm there, and at rendering I only do some calcs.

    Record  ColumnA  ColumnB   ColumnC
    1       ABC1234  09/01/10  09/02/10
    2       ABC1234  09/01/10  09/03/10
    3       DEF1234  09/02/10  09/04/10
    4       GHI1234  09/05/10  09/05/10
    5       GHI1234  09/05/10  09/06/10

    What I want is to return Record 1 or 2 (it doesn't matter which, I assume whichever is first in the database), Record 3, and Record 4 or 5.

    The normal distinct will return all of these.

    Thank you.
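To see why a plain DISTINCT keeps all five sample rows, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server (purely so the example runs anywhere), a hypothetical table name `Shipments`, and dates rewritten as ISO strings.

```python
import sqlite3

# Hypothetical in-memory table holding the five sample rows from the post.
# Assumption: SQLite stands in for SQL Server; dates are stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shipments (ColumnA TEXT, ColumnB TEXT, ColumnC TEXT)")
conn.executemany(
    "INSERT INTO Shipments VALUES (?, ?, ?)",
    [
        ("ABC1234", "2010-09-01", "2010-09-02"),
        ("ABC1234", "2010-09-01", "2010-09-03"),
        ("DEF1234", "2010-09-02", "2010-09-04"),
        ("GHI1234", "2010-09-05", "2010-09-05"),
        ("GHI1234", "2010-09-05", "2010-09-06"),
    ],
)

# DISTINCT compares whole rows, so rows that differ only in ColumnC all survive.
all_columns = conn.execute(
    "SELECT DISTINCT ColumnA, ColumnB, ColumnC FROM Shipments"
).fetchall()

# DISTINCT on ColumnA alone collapses to one row per shipping number,
# but ColumnB and ColumnC are no longer available.
only_a = conn.execute("SELECT DISTINCT ColumnA FROM Shipments").fetchall()

print(len(all_columns))  # 5
print(len(only_a))       # 3
```

DISTINCT has no per-column form, so getting one row per ColumnA while keeping the other columns means deciding which ColumnB and ColumnC values to keep.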

  • Why doesn't it matter which in records 1-2? If it doesn't matter why return that column at all?

    -Luke.

  • Luke L (9/21/2010)


    Why doesn't it matter which in records 1-2? If it doesn't matter why return that column at all?

    -Luke.

    Ok,

    ColumnA is a shipping number.

    Each shipping number may have several orders in it.

    Therefore, it is repeated, once for every order. I did not include the order number column, since I don't need it. Each unique shipping number is a load.

    I want to count how many loads there are in different delivery-date periods, hence ColumnB and ColumnC.
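One way the load count described above could be done in the query itself is sketched below. This is only an illustration under several assumptions: SQLite (via Python's sqlite3) stands in for SQL Server, the table name `Shipments` is hypothetical, ColumnC is taken to be the delivery date, and "period" is taken to mean calendar month.

```python
import sqlite3

# Hypothetical table with the sample rows; dates stored as ISO text (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shipments (ColumnA TEXT, ColumnB TEXT, ColumnC TEXT)")
conn.executemany(
    "INSERT INTO Shipments VALUES (?, ?, ?)",
    [
        ("ABC1234", "2010-09-01", "2010-09-02"),
        ("ABC1234", "2010-09-01", "2010-09-03"),
        ("DEF1234", "2010-09-02", "2010-09-04"),
        ("GHI1234", "2010-09-05", "2010-09-05"),
        ("GHI1234", "2010-09-05", "2010-09-06"),
    ],
)

# COUNT(DISTINCT ColumnA) counts each shipping number once per month,
# no matter how many order rows it has in that month.
loads_per_month = conn.execute(
    """
    SELECT strftime('%Y-%m', ColumnC) AS DeliveryMonth,
           COUNT(DISTINCT ColumnA)    AS Loads
    FROM Shipments
    GROUP BY DeliveryMonth
    ORDER BY DeliveryMonth
    """
).fetchall()

print(loads_per_month)  # [('2010-09', 3)]
```

Note that a shipping number whose rows fall in two different months would be counted once in each month with this query.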

  • tsmith-960032 (9/21/2010)


    Ok,

    ColumnA is a shipping number.

    Each shipping number may have several orders in it.

    Therefore, it is repeated, once for every order. I did not include the order number column, since I don't need it. Each unique shipping number is a load.

    I want to count how many loads there are in different delivery-date periods, hence ColumnB and ColumnC.

    Just to play a bit of devil's advocate here. What if instead of

    1 ABC1234 09/01/10 09/02/10

    2 ABC1234 09/01/10 09/03/10

    rows 1 and 2 looked like this?

    1 ABC1234 09/01/10 09/02/10

    2 ABC1234 10/01/10 10/03/10

    Depending on how you are calculating "different delivery date time periods", how do you know which date period to put shipment ABC1234 into? Or do you want it to show up in both?
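The ambiguity raised here can be made concrete with a small sketch. It assumes SQLite (via Python's sqlite3) in place of SQL Server, a hypothetical `Shipments` table, ISO date strings, and calendar months as the "period".

```python
import sqlite3

# Hypothetical table with the two "devil's advocate" rows for one shipment.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shipments (ColumnA TEXT, ColumnB TEXT, ColumnC TEXT)")
conn.executemany(
    "INSERT INTO Shipments VALUES (?, ?, ?)",
    [
        ("ABC1234", "2010-09-01", "2010-09-02"),
        ("ABC1234", "2010-10-01", "2010-10-03"),
    ],
)

# The same shipping number lands in a different month depending on whether
# the earliest or the latest ColumnC value is kept.
periods = conn.execute(
    """
    SELECT ColumnA,
           strftime('%Y-%m', MIN(ColumnC)) AS EarliestMonth,
           strftime('%Y-%m', MAX(ColumnC)) AS LatestMonth
    FROM Shipments
    GROUP BY ColumnA
    """
).fetchall()

print(periods)  # [('ABC1234', '2010-09', '2010-10')]
```

Whichever row is kept decides the period, which is why "it doesn't matter which row" only holds when all rows for a shipment fall in the same period.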

  • Luke L (9/21/2010)


    Just to play a bit of devil's advocate here. What if instead of

    1 ABC1234 09/01/10 09/02/10

    2 ABC1234 09/01/10 09/03/10

    rows 1 and 2 looked like this?

    1 ABC1234 09/01/10 09/02/10

    2 ABC1234 10/01/10 10/03/10

    Depending on how you are calculating "different delivery date time periods", how do you know which date period to put shipment ABC1234 into? Or do you want it to show up in both?

    Luke, the returned data is in the past, not the future. So, to render load statistics, for example, I would need to return DISTINCT shipping numbers to get total loads, and also use ColumnC to check the delivery period: previous year, current year, last month, etc.

    Yes, I could say

    Select Distinct ColumnA, Count(ColumnB) As CB, Count(ColumnC) As CC

    From SoAndSoTable

    Group By ColumnA

    But then I would lose the use of Columns B & C.

  • Try a CTE or temp table to get your DISTINCT values, then join that back to a SELECT where you query the data and constrain on the MIN or MAX of ColumnC.
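A minimal sketch of what this CTE suggestion might look like follows. It is one reading of the advice, not necessarily the exact query the poster had in mind. Assumptions: SQLite (via Python's sqlite3) stands in for SQL Server, and the names `Shipments` and `LatestPerLoad` are hypothetical.

```python
import sqlite3

# Hypothetical table with the sample rows; dates stored as ISO text (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shipments (ColumnA TEXT, ColumnB TEXT, ColumnC TEXT)")
conn.executemany(
    "INSERT INTO Shipments VALUES (?, ?, ?)",
    [
        ("ABC1234", "2010-09-01", "2010-09-02"),
        ("ABC1234", "2010-09-01", "2010-09-03"),
        ("DEF1234", "2010-09-02", "2010-09-04"),
        ("GHI1234", "2010-09-05", "2010-09-05"),
        ("GHI1234", "2010-09-05", "2010-09-06"),
    ],
)

# The CTE finds the latest ColumnC per shipping number; joining back to the
# table returns the full row that carries that value. If two rows tie on the
# maximum ColumnC, both are returned.
latest_rows = conn.execute(
    """
    WITH LatestPerLoad AS (
        SELECT ColumnA, MAX(ColumnC) AS MaxC
        FROM Shipments
        GROUP BY ColumnA
    )
    SELECT s.ColumnA, s.ColumnB, s.ColumnC
    FROM Shipments AS s
    JOIN LatestPerLoad AS l
      ON l.ColumnA = s.ColumnA AND l.MaxC = s.ColumnC
    ORDER BY s.ColumnA
    """
).fetchall()

for row in latest_rows:
    print(row)
```

Unlike taking MAX() of each column separately, the join keeps ColumnB and ColumnC from the same source row.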

  • Daniel Bowlin (9/22/2010)


    Try a CTE or temp table to get your DISTINCT values, then join that back to a SELECT where you query the data and constrain on the MIN or MAX of ColumnC.

    Thank you for your reply.

    I solved my issue as follows :

    Select Distinct ColumnA, Max(ColumnB) As ColumnB, Max(ColumnC) As ColumnC

    From SoAndSoTable

    Group By ColumnA

    Thanks again Daniel, and of course Thank You Luke.

  • tsmith-960032 (9/22/2010)


    Thank you for your reply.

    I solved my issue as follows :

    Select Distinct ColumnA, Max(ColumnB) As ColumnB, Max(ColumnC) As ColumnC

    From SoAndSoTable

    Group By ColumnA

    Thanks again Daniel, and of course Thank You Luke.

    Sorry for the late reply, but yes, that's what I was trying to get at. When you do your grouping, you need to decide which results you want from columns B and C. The MAX() function will get you there.

    Glad it worked out for you. However, using the DISTINCT keyword together with a GROUP BY clause is redundant, and I'm somewhat amazed that SQL doesn't throw an error here. I understand that it doesn't, 'cause I just tested it, but it still seems strange and may cause you issues down the road. I'd remove the DISTINCT keyword and just stick with the GROUP BY clause for clarity.

    -Luke.
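The redundancy can be checked with a quick sketch. It assumes SQLite (via Python's sqlite3) rather than SQL Server, a hypothetical `Shipments` table in place of SoAndSoTable, and ISO date strings so that MAX() orders dates correctly.

```python
import sqlite3

# Hypothetical table with the sample rows from earlier in the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Shipments (ColumnA TEXT, ColumnB TEXT, ColumnC TEXT)")
conn.executemany(
    "INSERT INTO Shipments VALUES (?, ?, ?)",
    [
        ("ABC1234", "2010-09-01", "2010-09-02"),
        ("ABC1234", "2010-09-01", "2010-09-03"),
        ("DEF1234", "2010-09-02", "2010-09-04"),
        ("GHI1234", "2010-09-05", "2010-09-05"),
        ("GHI1234", "2010-09-05", "2010-09-06"),
    ],
)

# The query as posted, with both DISTINCT and GROUP BY.
with_distinct = conn.execute(
    """
    SELECT DISTINCT ColumnA, MAX(ColumnB) AS ColumnB, MAX(ColumnC) AS ColumnC
    FROM Shipments
    GROUP BY ColumnA
    ORDER BY ColumnA
    """
).fetchall()

# The same query with GROUP BY only. Grouping already yields one row per
# ColumnA, so DISTINCT has nothing left to remove.
group_by_only = conn.execute(
    """
    SELECT ColumnA, MAX(ColumnB) AS ColumnB, MAX(ColumnC) AS ColumnC
    FROM Shipments
    GROUP BY ColumnA
    ORDER BY ColumnA
    """
).fetchall()

print(with_distinct == group_by_only)  # True
print(len(group_by_only))              # 3
```

Keep in mind that MAX(ColumnB) and MAX(ColumnC) are computed independently, so the two values in a result row need not come from the same source row.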
