SQL DISTINCT on Multiple Columns

Question

SQL DISTINCT on Multiple Columns

ss-457805

SSCertifiable

Points: 5872
More actions
February 7, 2011 at 10:10 pm

#234174

Comments posted to this topic are about the item SQL DISTINCT on Multiple Columns
blog: http://sarveshsingh.com Twitter: @sarveshsing

Viewing 15 posts - 1 through 15 (of 38 total)

You must be logged in to reply to this topic. Login to reply

Jeff Moden SSC Guru Points: 1004749 More actions · Answer 1

Wow... tough crowd. A bit cowardly too. People apparently have, so far, rated this article in the 1 or 2 star range but no one had the nads to say why.

@Sarvesh,

This is a good introductory article on DISTINCT and will help many a newbie. Although others have, apparently, not recognized it as such, this is a fine bit of "SQL Spackle" with some very easy to understand examples with output results and explanations as well as the GROUP BY alternatives. Don't let the folks that don't recognize what this article is for discourage you. Keep writing them.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

nwilske SSC Journeyman Points: 81 More actions · Answer 2

The input data does not match the examples later in the article. Param Singh is added twice, and this was not the intention.

Koen Verbeeck SSC Guru Points: 259215 More actions · Answer 3

Although the article was well-written and a good introduction to DISTINCT, I didn't give it many stars.

Why? I found that the article didn't introduce us anything that can't be found on the MSDN pages.

It would've been nice if the following were included in the article:

* a critical look on the performance impact of DISTINCT

* although the alternatives of GROUP BY are listed, it is not explained if one construct is prefered above the other, or if they are equivalent (if you look at the execution plans)

* the behaviour of DISTINCT with NULL values and in aggregate functions

However, the article itself was written well, with no obvious spelling mistakes and it had clear examples, so Sarvesh, as Jeff already mentioned, don't let this withhold you to write future articles!

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

srivatsa.hg SSC Journeyman Points: 92 More actions · Answer 4

The INSERT had these data :

('Param', 'Singh', 'B283SP', 'Birmingham'),

('Steve', 'White', 'EC224HQ', 'London'),

('Mark', 'Smith', 'L324JK', 'Liverpool'),

('Claire', 'whitehood', 'M236DM', 'Manchester'),

('Param', 'Singh', 'B283SP', 'Birmingham')

And the Select * from the table yielded

Sarvesh Singh B263SP Birmingham

Steve White EC224HQ London

Mark Smith L324JK Liverpool

Claire whitehood M236DM Manchester

Param Singh B283SP Birmingham

I presume this is a typo.But you need to edit this asap.

majorbloodnock SSCrazy Eights Points: 9389 More actions · Answer 5

Koen (da-zero) (2/8/2011)
Although the article was well-written and a good introduction to DISTINCT, I didn't give it many stars.
Why? I found that the article didn't introduce us anything that can't be found on the MSDN pages.

I see it from a slightly different perspective; there are very few articles, here or anywhere, that give any information above what's available in MSDN. Their value is in their ability to explain concepts better than MSDN and to serve up information in bite-sized chunks. Both these approaches make the same information far more readily accessible, and I believe this article is another successful example.

It would've been nice if the following were included in the article:
* a critical look on the performance impact of DISTINCT
* although the alternatives of GROUP BY are listed, it is not explained if one construct is prefered above the other, or if they are equivalent (if you look at the execution plans)
* the behaviour of DISTINCT with NULL values and in aggregate functions
However, the article itself was written well, with no obvious spelling mistakes and it had clear examples, so Sarvesh, as Jeff already mentioned, don't let this withhold you to write future articles!

Here I agree. I'm not entirely sure wandering into execution plans would have been entirely appropriate for the intended audience, but a mention of performance, alternative functions and working with NULLs could have been included to good effect.

Overall, I think it was a good beginner's introduction to an important SQL concept.

Semper in excretia, suus solum profundum variat

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 6

A good introduction to Distinct for novices, though it it really doesnt explain why you would consider using a Group by to get a distinct list, is there a performance differential to consider when using one method or the other?

I also consider that these two SQL scripts are completely different and thus shouldnt be considered alternatives for each other

SELECT PostCode, COUNT(Postcode)

FROM

(

SELECT DISTINCT Address1, Address2, City, Postcode

FROM AddressTable

) AS Sub

GROUP BY Postcode

HAVING COUNT(Postcode) > 1

and

SELECT Address1,Address2,City,PostCode,Count(PostCode)

FROM AddressTable

GROUP BY Address1,Address2,City,PostCode

HAVING Count(PostCode) > 1

The first one would work on the following list of data

B23 3SP, 1

EC1A 3AD, 2.

which would result in only the EC1A 3AD, 2 row being returned, however the alternative query works on a totally different set of data, whch may be :

6 some road, some district, B23 6QD, 1

8 another road, another district, EC1A 3AD, 1

10 another road, another district, EC1A 3AD, 1

So with the Having Count()>1 clause you would get no results returned.

_________________________________________________________________________
SSC Guide to Posting and Best Practices

James A Skipwith SSCommitted Points: 1808 More actions · Answer 7

I would agree that this is a good introductory article on DISTINCT and will help many a newbie. A minor section on performance may have been beneficial but it is not critical, as the article is aimed at novice users and getting them to start writing SQL. Tuning is a whole other topic and all they need to know is maybe to use distinct with caution and always try with group by as well to see which is faster (as this is the best indicator for a newbie who wouldn't understand an execution plan as yet).

Do keep on writing Sarvesh. I did my my first one yesterday and actually as a result learnt some things that I didn't know - which is always a bonus - and got some quality feedback from some very wise and kind folks. If you're brave enough to put your articles out you have to expect some pot shots but it's all a learning curve, so don't let it get you down. I'll be expecting an in-depth article on the performance differences between DISTINCT and GROUP BY from you soon!

James

James
MCM [@TheSQLPimp]

murat.korkmaz SSC Enthusiast Points: 108 More actions · Answer 8

To much failures, it's confusing, specially for the newbies!

Koen Verbeeck SSC Guru Points: 259215 More actions · Answer 9

murat.korkmaz (2/8/2011)
To much failures, it's confusing, specially for the newbies!

Can you elaborate on that? What were the failures (aside from the typo in the input query) and what parts were confusing?

Help the author to improve his article.

Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP

Jason-299789 SSC-Insane Points: 21601 More actions · Answer 10

murat.korkmaz (2/8/2011)
To much failures, it's confusing, specially for the newbies!

Sorry, but it is a clearly written article and far from confusing. Yes there are a few errors, but they are minimal and easily made.

All in all its a good article and the writer should keep writing them.

_________________________________________________________________________
SSC Guide to Posting and Best Practices

Marco V SSCrazy Points: 2939 More actions · Answer 11

i liked it, gave 3 stars, maybe a thing to consider is to explain what youre audience for this article is. It prevent of being judge on why you did not explain when to use distinct or group by in certain situations.

For a basic understanding of the use of distinct on mulitple colums i enjoyed reading it

ss-457805 SSCertifiable Points: 5872 More actions · Answer 12

Thank you all for your feedback. It means a lot.

The purpose of this artcle was to show simple usage of DISTINCT to a beginner.

blog: http://sarveshsingh.com Twitter: @sarveshsing

Thato SSC Veteran Points: 288 More actions · Answer 13

Good article. Straight to the point. Good examples too. Needs to work a little bit more on presentation,spell check and sort order options for output readability and quick comparison. Another small bit of advice, make the habit of using common table expressions. Keep it up. Thank you.

brichardson Grasshopper Points: 14 More actions · Answer 14

Good basic article on DISTINCT. Hadn't thought about using DISTINCT in aggregates--good idea.