Group by vs. Distinct

  • Which is preferable in case of Group by , Distinct... Can u help me

  • Depends on what you are trying to do. DISTINCT is just an automatic GROUP BY on every column in the SELECT clause, so if that is what you want DISTINCT requires less coding and maintenance. If that isn't what you want your only choice would be GROUP BY.

  • If you're aggregating (sum, avg, count, etc) then Group By.

    If you're looking for unique rows then Distinct.

    This falls into the category of being obvious what you're doing.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • UMG Developer (9/27/2010)


    Depends on what you are trying to do. DISTINCT is just an automatic GROUP BY on every column in the SELECT clause, so if that is what you want DISTINCT requires less coding and maintenance. If that isn't what you want your only choice would be GROUP BY.

    mmmhhh... since DISTINCT does not allow for aggregations I wouldn't say that DISTINCT is doing "an automatic GROUP BY"

    DISTINCT and GROUP BY are two very different tools as pointed out by Gail.

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
  • PaulB-TheOneAndOnly (9/29/2010)


    mmmhhh... since DISTINCT does not allow for aggregations I wouldn't say that DISTINCT is doing "an automatic GROUP BY"

    DISTINCT and GROUP BY are two very different tools as pointed out by Gail.

    How does the result of a DISTINCT query differ from the result of a GROUP BY query if all of the columns in the SELECT clause are put in the GROUP BY clause? (That is essentially what DISTINCT does unless I am missing something.)

    Sure with GROUP BY you can add a HAVING clause, but I have seen people use DISTINCT as a short-cut for grouping by all the returned columns a lot. (Eliminating duplicates.)

  • UMG Developer (9/29/2010)


    PaulB-TheOneAndOnly (9/29/2010)


    mmmhhh... since DISTINCT does not allow for aggregations I wouldn't say that DISTINCT is doing "an automatic GROUP BY"

    DISTINCT and GROUP BY are two very different tools as pointed out by Gail.

    How does the result of a DISTINCT query differ from the result of a GROUP BY query if all of the columns in the SELECT clause are put in the GROUP BY clause? (That is essentially what DISTINCT does unless I am missing something.)

    It's more correct to say the other way around. A group by with all the columns in the group by clause behaves much the same as a distinct when no aggregations are present. Distinct does not do an automatic group by on all columns, because you cannot do this

    SELECT distinct col1, col2, col3, count(*) from SomeTable

    However a group by without aggregations and with all columns in the group by clause does do a distinct (and the exec plans will usually be the same)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (9/30/2010)


    It's more correct to say the other way around. A group by with all the columns in the group by clause behaves much the same as a distinct when no aggregations are present. Distinct does not do an automatic group by on all columns, because you cannot do this

    SELECT distinct col1, col2, col3, count(*) from SomeTable

    However a group by without aggregations and with all columns in the group by clause does do a distinct (and the exec plans will usually be the same)

    But that query would be the same as:

    SELECT col1, col2, col3, count(*) FROM SomeTable GROUP BY col1, col2, col3, count(*);

    Which isn't valid either. (Of course that isn't the error that SQL Server returns, but in essence that is what DISTINCT would accomplish.)

  • One could always specify both:

    SELECT DISTINCT

    C.Title,

    C.FirstName

    FROM Person.Contact C

    GROUP BY

    C.Title,

    C.FirstName;

    😀

  • UMG Developer (9/29/2010)


    Sure with GROUP BY you can add a HAVING clause, but I have seen people use DISTINCT as a short-cut for grouping by all the returned columns a lot. (Eliminating duplicates.)

    Oh... I have seen people doing ugly things too. 😀

    DISTINCT as a shortcut for GROUP BY? wow, that really makes no sense to me.

    Self documentation is key when dealing with code. If I see DISTINCT I know developer wanted to get distinct values; if I see GROUP BY I know developer wanted to aggregate some values.

    I know you can use a knife as an screwdriver but having a screwdriver at hand... please use the right tool. 😉

    _____________________________________
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply