SQL DISTINCT on Multiple Columns

  • Comments posted to this topic are about the item SQL DISTINCT on Multiple Columns

  • Wow... tough crowd. A bit cowardly too. People apparently have, so far, rated this article in the 1 or 2 star range but no one had the nads to say why.

    @Sarvesh,

    This is a good introductory article on DISTINCT and will help many a newbie. Although others have, apparently, not recognized it as such, this is a fine bit of "SQL Spackle" with some very easy to understand examples with output results and explanations as well as the GROUP BY alternatives. Don't let the folks that don't recognize what this article is for discourage you. Keep writing them.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • The input data does not match the examples later in the article. Param Singh is added twice, and this was not the intention.

  • Although the article was well-written and a good introduction to DISTINCT, I didn't give it many stars.

    Why? I found that the article didn't introduce us anything that can't be found on the MSDN pages.

    It would've been nice if the following were included in the article:

    * a critical look on the performance impact of DISTINCT

    * although the alternatives of GROUP BY are listed, it is not explained if one construct is prefered above the other, or if they are equivalent (if you look at the execution plans)

    * the behaviour of DISTINCT with NULL values and in aggregate functions

    However, the article itself was written well, with no obvious spelling mistakes and it had clear examples, so Sarvesh, as Jeff already mentioned, don't let this withhold you to write future articles!

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • The INSERT had these data :

    ('Param', 'Singh', 'B283SP', 'Birmingham'),

    ('Steve', 'White', 'EC224HQ', 'London'),

    ('Mark', 'Smith', 'L324JK', 'Liverpool'),

    ('Claire', 'whitehood', 'M236DM', 'Manchester'),

    ('Param', 'Singh', 'B283SP', 'Birmingham')

    And the Select * from the table yielded

    Sarvesh Singh B263SP Birmingham

    Steve White EC224HQ London

    Mark Smith L324JK Liverpool

    Claire whitehood M236DM Manchester

    Param Singh B283SP Birmingham

    I presume this is a typo.But you need to edit this asap.

  • Koen (da-zero) (2/8/2011)


    Although the article was well-written and a good introduction to DISTINCT, I didn't give it many stars.

    Why? I found that the article didn't introduce us anything that can't be found on the MSDN pages.

    I see it from a slightly different perspective; there are very few articles, here or anywhere, that give any information above what's available in MSDN. Their value is in their ability to explain concepts better than MSDN and to serve up information in bite-sized chunks. Both these approaches make the same information far more readily accessible, and I believe this article is another successful example.

    It would've been nice if the following were included in the article:

    * a critical look on the performance impact of DISTINCT

    * although the alternatives of GROUP BY are listed, it is not explained if one construct is prefered above the other, or if they are equivalent (if you look at the execution plans)

    * the behaviour of DISTINCT with NULL values and in aggregate functions

    However, the article itself was written well, with no obvious spelling mistakes and it had clear examples, so Sarvesh, as Jeff already mentioned, don't let this withhold you to write future articles!

    Here I agree. I'm not entirely sure wandering into execution plans would have been entirely appropriate for the intended audience, but a mention of performance, alternative functions and working with NULLs could have been included to good effect.

    Overall, I think it was a good beginner's introduction to an important SQL concept.

    Semper in excretia, suus solum profundum variat

  • A good introduction to Distinct for novices, though it it really doesnt explain why you would consider using a Group by to get a distinct list, is there a performance differential to consider when using one method or the other?

    I also consider that these two SQL scripts are completely different and thus shouldnt be considered alternatives for each other

    SELECT PostCode, COUNT(Postcode)

    FROM

    (

    SELECT DISTINCT Address1, Address2, City, Postcode

    FROM AddressTable

    ) AS Sub

    GROUP BY Postcode

    HAVING COUNT(Postcode) > 1

    and

    SELECT Address1,Address2,City,PostCode,Count(PostCode)

    FROM AddressTable

    GROUP BY Address1,Address2,City,PostCode

    HAVING Count(PostCode) > 1

    The first one would work on the following list of data

    B23 3SP, 1

    EC1A 3AD, 2.

    which would result in only the EC1A 3AD, 2 row being returned, however the alternative query works on a totally different set of data, whch may be :

    6 some road, some district, B23 6QD, 1

    8 another road, another district, EC1A 3AD, 1

    10 another road, another district, EC1A 3AD, 1

    So with the Having Count()>1 clause you would get no results returned.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • I would agree that this is a good introductory article on DISTINCT and will help many a newbie. A minor section on performance may have been beneficial but it is not critical, as the article is aimed at novice users and getting them to start writing SQL. Tuning is a whole other topic and all they need to know is maybe to use distinct with caution and always try with group by as well to see which is faster (as this is the best indicator for a newbie who wouldn't understand an execution plan as yet).

    Do keep on writing Sarvesh. I did my my first one yesterday and actually as a result learnt some things that I didn't know - which is always a bonus - and got some quality feedback from some very wise and kind folks. If you're brave enough to put your articles out you have to expect some pot shots but it's all a learning curve, so don't let it get you down. I'll be expecting an in-depth article on the performance differences between DISTINCT and GROUP BY from you soon!

    James

    James
    MCM [@TheSQLPimp]

  • To much failures, it's confusing, specially for the newbies!

  • murat.korkmaz (2/8/2011)


    To much failures, it's confusing, specially for the newbies!

    Can you elaborate on that? What were the failures (aside from the typo in the input query) and what parts were confusing?

    Help the author to improve his article.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • murat.korkmaz (2/8/2011)


    To much failures, it's confusing, specially for the newbies!

    Sorry, but it is a clearly written article and far from confusing. Yes there are a few errors, but they are minimal and easily made.

    All in all its a good article and the writer should keep writing them.

    _________________________________________________________________________
    SSC Guide to Posting and Best Practices

  • i liked it, gave 3 stars, maybe a thing to consider is to explain what youre audience for this article is. It prevent of being judge on why you did not explain when to use distinct or group by in certain situations.

    For a basic understanding of the use of distinct on mulitple colums i enjoyed reading it

  • Thank you all for your feedback. It means a lot.

    The purpose of this artcle was to show simple usage of DISTINCT to a beginner.

  • Good article. Straight to the point. Good examples too. Needs to work a little bit more on presentation,spell check and sort order options for output readability and quick comparison. Another small bit of advice, make the habit of using common table expressions. Keep it up. Thank you.

  • Good basic article on DISTINCT. Hadn't thought about using DISTINCT in aggregates--good idea.

Viewing 15 posts - 1 through 15 (of 37 total)

You must be logged in to reply to this topic. Login to reply