SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


Not In v Not Equal


Not In v Not Equal

Author
Message
jnaras
jnaras
Old Hand
Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)Old Hand (346 reputation)

Group: General Forum Members
Points: 346 Visits: 78

Hi Ken,

Interesting analysis article. Nice work.

Thanks.

Naras.

Narasimhan Jayachandran





Miriamka
Miriamka
Grasshopper
Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)

Group: General Forum Members
Points: 11 Visits: 46

I ran some of the queries on SQL 2000 and SQL 2005 [on the servers]. I ran it for 50 times only.
I got follow results:

SQL 2000 SP3:

time: 1
NOT IN :1376 ms
AND <> :1796 ms
derived UNION table LEFT OUTER JOIN : 453 ms

time: 50
NOT IN :58110 ms
AND <> :58800 ms
derived UNION table LEFT OUTER JOIN : 43703 ms

SQL 2005 SP1:

time: 1
NOT IN :1220 ms
AND <> :1173 ms
derived UNION table LEFT OUTER JOIN : 420 ms

time: 50
NOT IN :50236 ms
AND <> :53313 ms
derived UNION table LEFT OUTER JOIN : 21466 ms

My experiment also shows that "derived UNION table LEFT OUTER JOIN" is better than 'NOT IN" or "AND <>".....

Thanks,

Miriamka





NickMcNair
NickMcNair
Forum Newbie
Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)Forum Newbie (6 reputation)

Group: General Forum Members
Points: 6 Visits: 113

Guys,

By replacing the union query with a temporary table you can get a further performance enhancement, this implies that a Constant scan is slower than a Table scan.

My results...

Beginning test run...1 NOT IN
Elapsed Time: 121543 ms

Beginning test run...2 <>
Elapsed Time: 132310 ms

Beginning test run...3 LEFT OUTER JOIN
Elapsed Time: 105760 ms

Beginning test run...4 NOT EXISTS (UNION)
Elapsed Time: 96250 ms

Beginning test run...5 NOT EXISTS (temp table)
Elapsed Time: 66633 ms

ta.

Nick





James Rimell
James Rimell
Valued Member
Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)Valued Member (71 reputation)

Group: General Forum Members
Points: 71 Visits: 17
All interesting stuff, but difficult to really tell due to hardware differences. e.g. anyone who has a v.fast temp table setup (i.e. maybe on a many spindle RAID 0 or even RAMDISK) compared to their permanent tables (maybe on RAID5) would see performance differences purely based on IO rather than the T-SQL differences.
Michelle-138172
Michelle-138172
SSC Veteran
SSC Veteran (278 reputation)SSC Veteran (278 reputation)SSC Veteran (278 reputation)SSC Veteran (278 reputation)SSC Veteran (278 reputation)SSC Veteran (278 reputation)SSC Veteran (278 reputation)SSC Veteran (278 reputation)

Group: General Forum Members
Points: 278 Visits: 357
On a different note about Left Join(Not related to the tests here), I have noticed that a Left Join with a not null filter in the Where statement outperforms an Inner Join. Does anyone have had same findings?
Paulo G.
Paulo G.
Forum Newbie
Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)Forum Newbie (5 reputation)

Group: General Forum Members
Points: 5 Visits: 17
For this particular case of continuous values, 'NOT BETWEEN 214 AND 217' runs twice as faster as the best solution (NOT EXISTS) !
Sharon Matyk
Sharon Matyk
SSC-Enthusiastic
SSC-Enthusiastic (108 reputation)SSC-Enthusiastic (108 reputation)SSC-Enthusiastic (108 reputation)SSC-Enthusiastic (108 reputation)SSC-Enthusiastic (108 reputation)SSC-Enthusiastic (108 reputation)SSC-Enthusiastic (108 reputation)SSC-Enthusiastic (108 reputation)

Group: General Forum Members
Points: 108 Visits: 47
This article, and its many replies, is a great example of why I like this site so much! Valuable points are made, with time taken to describe exactly how answers were derived, and comments flesh out the original discussion quite nicely.

Overall, thanks very much for this - Sharon.
Ken Johnson-162971
Ken Johnson-162971
SSC Veteran
SSC Veteran (234 reputation)SSC Veteran (234 reputation)SSC Veteran (234 reputation)SSC Veteran (234 reputation)SSC Veteran (234 reputation)SSC Veteran (234 reputation)SSC Veteran (234 reputation)SSC Veteran (234 reputation)

Group: General Forum Members
Points: 234 Visits: 375
Paulo G., that was a great catch!

BETWEEN didn't even cross my mind when looking at this originally. I was looking more at the generalities of which techniques worked best with a multi-value filter and totally missed the fact that my filter data were sequential.

If we change our sample data to something like (8, 214, 271, 310), NOT EXISTS keeps its performance edge, but the best solution should take into account the filter data, and the BETWEEN solution crushed the NOT EXISTS solution for the given sample (214-217).

Thanks for the new angle!

KenJ
I cant let you do that Dave
I cant let you do that Dave
SSC-Enthusiastic
SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)SSC-Enthusiastic (115 reputation)

Group: General Forum Members
Points: 115 Visits: 271
as an aside set base generation is a tad quicker


set statistics io on
set statistics time on
CREATE TABLE tbl_IN_VS_AND (filterCriterion_sv int PRIMARY KEY CLUSTERED NOT NULL)

Insert tbl_IN_VS_AND
SELECT 100000 * U.I + 10000 * V.I + 1000 * W.I + 100 * X.I + 10 * Y.I + Z.I filterCriterion_sv FROM
(SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) U,
(SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) V,
(SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) W,
(SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) X,
(SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) Y,
(SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) Z

set statistics io off
set statistics time off


at about 10 seconds on a dual core at least.

to compare hardware the first two queries in the post on a baseline of 100 I get
43766 ms and 43750 ms elapsed
Biggest shock was the times of

SELECT @results = count(filterCriterion_sv)
FROM tbl_IN_VS_AND
WHERE filterCriterion_sv NOT IN (Select 214 I UNION SELECT 215 UNION SELECT 216 UNION SELECT 217)

SELECT @results = count(filterCriterion_sv)
FROM tbl_IN_VS_AND
WHERE filterCriterion_sv NOT IN (214,215,216,217)

were 27176 ms
and 43486 ms

while

SELECT @results = count(filterCriterion_sv)
FROM tbl_IN_VS_AND
WHERE NOT EXISTS(SELECT * FROM
(
SELECT 214 AS filterValue_val UNION ALL
SELECT 215 UNION ALL
SELECT 216 UNION ALL
SELECT 217 ) AS tbl
WHERE tbl.filterValue_val = tbl_IN_VS_AND.filterCriterion_sv )

ended up was still 14470 ms so twice as good as the union

I guess it must not construct the in predicate into a working table before starting the query. (until you tell it to via union)
VM-446915
VM-446915
Grasshopper
Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)Grasshopper (11 reputation)

Group: General Forum Members
Points: 11 Visits: 15
In my experience it is always faster to join with "exists" then do a painful "in". "In" is only good for several values. With ssis, you can actually use a pre-existing table to insert your values into and join aginst.
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search