Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

which is the easy way to Eliminate the duplicates? Expand / Collapse
Author
Message
Posted Tuesday, October 16, 2012 8:13 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, May 10, 2013 5:41 AM
Points: 12, Visits: 32
Hi,
Please suggest me to get the distinct values from below query

declare @T as table(id int identity, code varchar(10))

insert into @T(code) values('111')
insert into @T(code) values('111')
insert into @T(code) values('222')
insert into @T(code) values('222')
insert into @T(code) values('333')

select * from @T

I want the result as
------------------
111
222
333
Post #1373292
Posted Tuesday, October 16, 2012 8:14 AM
Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 6:22 AM
Points: 3,845, Visits: 1,086
use distinct clause in select


Post #1373294
Posted Tuesday, October 16, 2012 8:59 AM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Monday, October 28, 2013 5:13 AM
Points: 61, Visits: 721
select distinct code from @T

Post #1373332
Posted Tuesday, October 16, 2012 4:16 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 7:09 AM
Points: 35,399, Visits: 31,960
lokesha.b (10/16/2012)
Hi,
Please suggest me to get the distinct values from below query

declare @T as table(id int identity, code varchar(10))

insert into @T(code) values('111')
insert into @T(code) values('111')
insert into @T(code) values('222')
insert into @T(code) values('222')
insert into @T(code) values('333')

select * from @T

I want the result as
------------------
111
222
333


For what you're showing, the others are correct. DISTINCT will do it.

The real question is, what are you actually doing? I ask because the use of DISTINCT is a usual indication that a joined query was written incorrectly or the underlying database design really needs some work.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1373569
Posted Tuesday, October 16, 2012 11:07 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, May 10, 2013 5:41 AM
Points: 12, Visits: 32
I'm Sorry, I confudsed the question; however, my target is to return both the values like ID and Code result should be as

ID Code
-- -----
1 111
3 222
5 333


Querry is
---------
declare @T as table(ID int identity, Code varchar(10))

insert into @T(code) values('111')
insert into @T(code) values('111')
insert into @T(code) values('222')
insert into @T(code) values('222')
insert into @T(code) values('333')

select * from @T

result
------

ID Code
-- ----
1 111
3 222
5 333




Post #1373622
Posted Tuesday, October 16, 2012 11:29 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 3:02 AM
Points: 3,421, Visits: 5,361
I recommend that you take heed of Jeff's question above. An improper JOIN often generates tons of duplicates and developers can be too lazy to understand why and eliminate them the proper way.

To return the id number with the code, you can do this:

;WITH Results AS (
SELECT id, code
,rn=ROW_NUMBER() OVER (PARTITION BY code ORDER BY id)
FROM @T)
SELECT id, code
FROM Results
WHERE rn=1





My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1373627
Posted Wednesday, October 17, 2012 1:47 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Saturday, October 25, 2014 11:42 AM
Points: 2,380, Visits: 7,591
dwain.c (10/16/2012)
I recommend that you take heed of Jeff's question above. An improper JOIN often generates tons of duplicates and developers can be too lazy to understand why and eliminate them the proper way.

To return the id number with the code, you can do this:

;WITH Results AS (
SELECT id, code
,rn=ROW_NUMBER() OVER (PARTITION BY code ORDER BY id)
FROM @T)
SELECT id, code
FROM Results
WHERE rn=1




Why bother with the row_number?
SELECT MIN(id) AS id, code
FROM @T
GROUP BY code;

Performance test: -
IF object_id('tempdb..#T') IS NOT NULL
BEGIN
DROP TABLE #T;
END;

SELECT TOP 1000000 IDENTITY(INT,1,1) AS id,
CAST((ABS(CHECKSUM(NEWID())) % 100) + 1 AS VARCHAR(3)) AS code
INTO #T
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3;

DECLARE @HOLDER INT, @HOLDER2 VARCHAR(3);

PRINT 'ROWNUMBER';
SET STATISTICS IO, TIME ON;

;WITH Results AS (
SELECT id, code
,rn=ROW_NUMBER() OVER (PARTITION BY code ORDER BY id)
FROM #T)
SELECT @HOLDER = id, @HOLDER2 = code
FROM Results
WHERE rn=1;

SET STATISTICS IO, TIME OFF;

PRINT 'MIN';
SET STATISTICS IO, TIME ON;

SELECT @HOLDER = MIN(id), @HOLDER2 = code
FROM #T
GROUP BY code;

SET STATISTICS IO, TIME OFF;

Results: -
ROWNUMBER
Table '#T'. Scan count 5, logical reads 2341, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 4797 ms, elapsed time = 2547 ms.
MIN
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#T'. Scan count 5, logical reads 2341, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 626 ms, elapsed time = 164 ms.


Again, but with indexes: -
IF object_id('tempdb..#T') IS NOT NULL
BEGIN
DROP TABLE #T;
END;

SELECT TOP 1000000 IDENTITY(INT,1,1) AS id,
CAST((ABS(CHECKSUM(NEWID())) % 100) + 1 AS VARCHAR(3)) AS code
INTO #T
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3;

CREATE CLUSTERED INDEX cl_t_id ON #T(id);
CREATE NONCLUSTERED INDEX nc_t_code ON #T(code);

DECLARE @HOLDER INT, @HOLDER2 VARCHAR(3);

PRINT 'ROWNUMBER';
SET STATISTICS IO, TIME ON;

;WITH Results AS (
SELECT id, code
,rn=ROW_NUMBER() OVER (PARTITION BY code ORDER BY id)
FROM #T)
SELECT @HOLDER = id, @HOLDER2 = code
FROM Results
WHERE rn=1;

SET STATISTICS IO, TIME OFF;

PRINT 'MIN';
SET STATISTICS IO, TIME ON;

SELECT @HOLDER = MIN(id), @HOLDER2 = code
FROM #T
GROUP BY code;

SET STATISTICS IO, TIME OFF;

Results: -
ROWNUMBER
Table '#T'. Scan count 1, logical reads 1979, physical reads 0, read-ahead reads 15, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 422 ms, elapsed time = 409 ms.
MIN
Table '#T'. Scan count 1, logical reads 1979, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 391 ms, elapsed time = 402 ms.


So roughly equivalent once indexes are in place, but wildly different before they are.



Not a DBA, just trying to learn

For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/

For better, quicker answers on SQL Server performance related questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/



If you litter your database queries with nolock query hints, are you aware of the side effects?
Try reading a few of these links...

(*) Missing rows with nolock
(*) Allocation order scans with nolock
(*) Consistency issues with nolock
(*) Transient Corruption Errors in SQL Server error log caused by nolock
(*) Dirty reads, read errors, reading rows twice and missing rows with nolock


LinkedIn | Blog coming soon (for sufficiently large values of "soon" )!
Post #1373661
Posted Wednesday, October 17, 2012 3:26 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 3:02 AM
Points: 3,421, Visits: 5,361
Cadavre (10/17/2012)
dwain.c (10/16/2012)
I recommend that you take heed of Jeff's question above. An improper JOIN often generates tons of duplicates and developers can be too lazy to understand why and eliminate them the proper way.

To return the id number with the code, you can do this:

;WITH Results AS (
SELECT id, code
,rn=ROW_NUMBER() OVER (PARTITION BY code ORDER BY id)
FROM @T)
SELECT id, code
FROM Results
WHERE rn=1




Why bother with the row_number?
SELECT MIN(id) AS id, code
FROM @T
GROUP BY code;

Performance test: -
IF object_id('tempdb..#T') IS NOT NULL
BEGIN
DROP TABLE #T;
END;

SELECT TOP 1000000 IDENTITY(INT,1,1) AS id,
CAST((ABS(CHECKSUM(NEWID())) % 100) + 1 AS VARCHAR(3)) AS code
INTO #T
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3;

DECLARE @HOLDER INT, @HOLDER2 VARCHAR(3);

PRINT 'ROWNUMBER';
SET STATISTICS IO, TIME ON;

;WITH Results AS (
SELECT id, code
,rn=ROW_NUMBER() OVER (PARTITION BY code ORDER BY id)
FROM #T)
SELECT @HOLDER = id, @HOLDER2 = code
FROM Results
WHERE rn=1;

SET STATISTICS IO, TIME OFF;

PRINT 'MIN';
SET STATISTICS IO, TIME ON;

SELECT @HOLDER = MIN(id), @HOLDER2 = code
FROM #T
GROUP BY code;

SET STATISTICS IO, TIME OFF;

Results: -
ROWNUMBER
Table '#T'. Scan count 5, logical reads 2341, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 4797 ms, elapsed time = 2547 ms.
MIN
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#T'. Scan count 5, logical reads 2341, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 626 ms, elapsed time = 164 ms.


Again, but with indexes: -
IF object_id('tempdb..#T') IS NOT NULL
BEGIN
DROP TABLE #T;
END;

SELECT TOP 1000000 IDENTITY(INT,1,1) AS id,
CAST((ABS(CHECKSUM(NEWID())) % 100) + 1 AS VARCHAR(3)) AS code
INTO #T
FROM master.dbo.syscolumns sc1, master.dbo.syscolumns sc2, master.dbo.syscolumns sc3;

CREATE CLUSTERED INDEX cl_t_id ON #T(id);
CREATE NONCLUSTERED INDEX nc_t_code ON #T(code);

DECLARE @HOLDER INT, @HOLDER2 VARCHAR(3);

PRINT 'ROWNUMBER';
SET STATISTICS IO, TIME ON;

;WITH Results AS (
SELECT id, code
,rn=ROW_NUMBER() OVER (PARTITION BY code ORDER BY id)
FROM #T)
SELECT @HOLDER = id, @HOLDER2 = code
FROM Results
WHERE rn=1;

SET STATISTICS IO, TIME OFF;

PRINT 'MIN';
SET STATISTICS IO, TIME ON;

SELECT @HOLDER = MIN(id), @HOLDER2 = code
FROM #T
GROUP BY code;

SET STATISTICS IO, TIME OFF;

Results: -
ROWNUMBER
Table '#T'. Scan count 1, logical reads 1979, physical reads 0, read-ahead reads 15, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 422 ms, elapsed time = 409 ms.
MIN
Table '#T'. Scan count 1, logical reads 1979, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 391 ms, elapsed time = 402 ms.


So roughly equivalent once indexes are in place, but wildly different before they are.


Nice effort Cadavre! Interesting results too (with the indexing). The former doesn't surprise me at all - wish I'd thought of it.



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1373702
Posted Wednesday, October 17, 2012 8:15 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 7:09 AM
Points: 35,399, Visits: 31,960
Just my 2 cents on indexing. Be careful when using only indexing to "tune" a query especially on OLTP tables. I'm not suggesting that you should avoid such a thing but if you can make a query run much faster by code rather than by index, you've not only decreased index maintenance time a bit, but you've also made INSERTs run a little faster (sometimes, a lot faster... I've seen some indexes actually cause timeouts because of extent splitting) and you've made the code more bullet proof for the future. Another advantage of increasing performance by code instead of by index is space savings on disk and backup/restore times.

There are times where indexing is absolutely necessary and there are times where you really should bite the bullet and write some alternative code. If the code is using DISTINCT, really consider the latter.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1373818
Posted Wednesday, October 17, 2012 7:00 PM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: Today @ 3:02 AM
Points: 3,421, Visits: 5,361
Jeff Moden (10/17/2012)
Just my 2 cents on indexing. Be careful when using only indexing to "tune" a query especially on OLTP tables. I'm not suggesting that you should avoid such a thing but if you can make a query run much faster by code rather than by index, you've not only decreased index maintenance time a bit, but you've also made INSERTs run a little faster (sometimes, a lot faster... I've seen some indexes actually cause timeouts because of extent splitting) and you've made the code more bullet proof for the future. Another advantage of increasing performance by code instead of by index is space savings on disk and backup/restore times.

There are times where indexing is absolutely necessary and there are times where you really should bite the bullet and write some alternative code. If the code is using DISTINCT, really consider the latter.


Sounds like the techie explanation of my second signature line!



My mantra: No loops! No CURSORs! No RBAR! Hoo-uh!

My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.


Need to UNPIVOT? Why not CROSS APPLY VALUES instead?
Since random numbers are too important to be left to chance, let's generate some!
Learn to understand recursive CTEs by example.
Splitting strings based on patterns can be fast!
Post #1374096
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse