November 23, 2011 at 9:03 am
i have a table that looks like the one below
table1
__________________________________________
nm key address abc efg xxx
__________________________________________
aa 1 bks hj 2 e
aa 1 as g 0 d
aa 1 ghn a 1 v
bb 2 qas c 6 b
bb 2 qqq l 7 h
cc 3 dcf u 5 a
cc 3 bhu z 1 s
cc 3 opa w 8 d
dd 4 zz x 1 f
dd 4 xx p 1 j
e 5 op s 0 k
f 6 il d 9 c
g 7 nm s 3 m
h 8 cv g 4 n
now i want to exclude all the records that has multiple nm and key
and i like the result set to be
e 5 op s 0 k
f 6 il d 9 c
g 7 nm s 3 m
h 8 cv g 4 n
can any one help me with how to write the query to get the above mentioned result set
thanks
November 23, 2011 at 9:08 am
Are you familiar with IN clauses, and with Group By and Having?
select *
from dbo.MyTable
where MyColumn not in (select MyColumn from dbo.MyTable Group By MyColumn Having count(*) > 1);
Alternative, a Where Not Exists might perform better, and is a valid option.
Another would be a self-outer join with a Where PK_Column Is Null clause.
You'll need to test each possible solution to see which works best on your data.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
November 23, 2011 at 9:18 am
thanks for the quick response , and i forgot to mention i am using two comparisons as well
with the where clause
eg
where
col1 is null
and
col2 <= 0
so how do i go ahead and write the query ?
November 23, 2011 at 9:18 am
Plenty of ways to do this.
DECLARE @table1 AS TABLE (nm CHAR(2), INT, address CHAR(3), abc CHAR(2), efg INT, xxx CHAR(1))
INSERT INTO @table1
SELECT 'aa', 1, 'bks', 'hj', 2, 'e'
UNION ALL SELECT 'aa', 1, 'as', 'g', 0, 'd'
UNION ALL SELECT 'aa', 1, 'ghn', 'a', 1, 'v'
UNION ALL SELECT 'bb', 2, 'qas', 'c', 6, 'b'
UNION ALL SELECT 'bb', 2, 'qqq', 'l', 7, 'h'
UNION ALL SELECT 'cc', 3, 'dcf', 'u', 5, 'a'
UNION ALL SELECT 'cc', 3, 'bhu', 'z', 1, 's'
UNION ALL SELECT 'cc', 3, 'opa', 'w', 8, 'd'
UNION ALL SELECT 'dd', 4, 'zz', 'x', 1, 'f'
UNION ALL SELECT 'dd', 4, 'xx', 'p', 1, 'j'
UNION ALL SELECT 'e', 5, 'op', 's', 0, 'k'
UNION ALL SELECT 'f', 6, 'il', 'd', 9, 'c'
UNION ALL SELECT 'g', 7, 'nm', 's', 3, 'm'
UNION ALL SELECT 'h', 8, 'cv', 'g', 4, 'n'
SELECT *
FROM @table1
WHERE nm IN (SELECT nm
FROM @table1
GROUP BY nm
HAVING COUNT(*) = 1)
--EDIT--
Ahhh, beaten to the punch.
November 23, 2011 at 9:20 am
ohh yeh i got it , never mind
thanks again
November 23, 2011 at 10:07 am
Cadavre (11/23/2011)
Ahhh, beaten to the punch.
Not necessarily. On my sample data, using IN outperforms using NOT IN by a significant margin. I assume that this will mainly be determined by how many duplicated versus unique records there are in the data. My data has a lot of repeats, so using IN is more selective.
Here are the estimated costs (and the IO, TIME statistics roughly math)
NOT EXISTS 14%
SELF JOIN 14%
IN 14%
COUNT/OVER 20%
NOT IN 39%
The NOT EXISTS and SELF JOIN had almost identical estimates and stats, but the exact distributions of costs was slightly different. The IN had a slightly more complicated execution plan which also showed up in the stats, but it wasn't enough of a difference to bump up the percentage. COUNT/OVER uses a CTE with the windowing functions, but it didn't perform well, so I'm not including details. NOT IN performed the worst at almost twice the cost of the second least efficient method. Of course, these results may be very different with a different set of data.
Drew
Drew
J. Drew Allen
Business Intelligence Analyst
Philadelphia, PA
Viewing 6 posts - 1 through 6 (of 6 total)
You must be logged in to reply to this topic. Login to reply