|
|
|
Old Hand
      
Group: General Forum Members
Last Login: Monday, March 04, 2013 1:48 PM
Points: 334,
Visits: 72
|
|
Hi Ken, Interesting analysis article. Nice work. Thanks. Naras. Narasimhan Jayachandran
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Tuesday, January 15, 2013 12:29 PM
Points: 11,
Visits: 25
|
|
I ran some of the queries on SQL 2000 and SQL 2005 [on the servers]. I ran it for 50 times only. I got follow results: SQL 2000 SP3: time: 1 NOT IN :1376 ms AND <> :1796 ms derived UNION table LEFT OUTER JOIN : 453 ms time: 50 NOT IN :58110 ms AND <> :58800 ms derived UNION table LEFT OUTER JOIN : 43703 ms SQL 2005 SP1: time: 1 NOT IN :1220 ms AND <> :1173 ms derived UNION table LEFT OUTER JOIN : 420 ms time: 50 NOT IN :50236 ms AND <> :53313 ms derived UNION table LEFT OUTER JOIN : 21466 ms My experiment also shows that "derived UNION table LEFT OUTER JOIN" is better than 'NOT IN" or "AND <>"..... Thanks, Miriamka
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Friday, June 14, 2013 2:13 AM
Points: 2,
Visits: 85
|
|
Guys, By replacing the union query with a temporary table you can get a further performance enhancement, this implies that a Constant scan is slower than a Table scan. My results... Beginning test run...1 NOT IN Elapsed Time: 121543 ms Beginning test run...2 <> Elapsed Time: 132310 ms Beginning test run...3 LEFT OUTER JOIN Elapsed Time: 105760 ms Beginning test run...4 NOT EXISTS (UNION) Elapsed Time: 96250 ms Beginning test run...5 NOT EXISTS (temp table) Elapsed Time: 66633 ms
ta. Nick
|
|
|
|
|
SSC Rookie
      
Group: General Forum Members
Last Login: Friday, February 03, 2012 5:53 AM
Points: 41,
Visits: 17
|
|
| All interesting stuff, but difficult to really tell due to hardware differences. e.g. anyone who has a v.fast temp table setup (i.e. maybe on a many spindle RAID 0 or even RAMDISK) compared to their permanent tables (maybe on RAID5) would see performance differences purely based on IO rather than the T-SQL differences.
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Today @ 10:02 AM
Points: 113,
Visits: 187
|
|
| On a different note about Left Join(Not related to the tests here), I have noticed that a Left Join with a not null filter in the Where statement outperforms an Inner Join. Does anyone have had same findings?
|
|
|
|
|
Forum Newbie
      
Group: General Forum Members
Last Login: Sunday, February 03, 2013 1:24 AM
Points: 4,
Visits: 11
|
|
| For this particular case of continuous values, 'NOT BETWEEN 214 AND 217' runs twice as faster as the best solution (NOT EXISTS) !
|
|
|
|
|
SSC Journeyman
      
Group: General Forum Members
Last Login: Thursday, August 26, 2010 10:00 AM
Points: 92,
Visits: 47
|
|
This article, and its many replies, is a great example of why I like this site so much! Valuable points are made, with time taken to describe exactly how answers were derived, and comments flesh out the original discussion quite nicely.
Overall, thanks very much for this - Sharon.
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Tuesday, June 11, 2013 9:08 AM
Points: 164,
Visits: 253
|
|
Paulo G., that was a great catch!
BETWEEN didn't even cross my mind when looking at this originally. I was looking more at the generalities of which techniques worked best with a multi-value filter and totally missed the fact that my filter data were sequential.
If we change our sample data to something like (8, 214, 271, 310), NOT EXISTS keeps its performance edge, but the best solution should take into account the filter data, and the BETWEEN solution crushed the NOT EXISTS solution for the given sample (214-217).
Thanks for the new angle!
KenJ
|
|
|
|
|
SSC Journeyman
      
Group: General Forum Members
Last Login: Monday, May 20, 2013 11:36 PM
Points: 85,
Visits: 252
|
|
as an aside set base generation is a tad quicker
set statistics io on set statistics time on CREATE TABLE tbl_IN_VS_AND (filterCriterion_sv int PRIMARY KEY CLUSTERED NOT NULL)
Insert tbl_IN_VS_AND SELECT 100000 * U.I + 10000 * V.I + 1000 * W.I + 100 * X.I + 10 * Y.I + Z.I filterCriterion_sv FROM (SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) U, (SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) V, (SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) W, (SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) X, (SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) Y, (SELECT 0 I UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) Z
set statistics io off set statistics time off
at about 10 seconds on a dual core at least.
to compare hardware the first two queries in the post on a baseline of 100 I get 43766 ms and 43750 ms elapsed Biggest shock was the times of
SELECT @results = count(filterCriterion_sv) FROM tbl_IN_VS_AND WHERE filterCriterion_sv NOT IN (Select 214 I UNION SELECT 215 UNION SELECT 216 UNION SELECT 217)
SELECT @results = count(filterCriterion_sv) FROM tbl_IN_VS_AND WHERE filterCriterion_sv NOT IN (214,215,216,217)
were 27176 ms and 43486 ms
while
SELECT @results = count(filterCriterion_sv) FROM tbl_IN_VS_AND WHERE NOT EXISTS(SELECT * FROM ( SELECT 214 AS filterValue_val UNION ALL SELECT 215 UNION ALL SELECT 216 UNION ALL SELECT 217 ) AS tbl WHERE tbl.filterValue_val = tbl_IN_VS_AND.filterCriterion_sv )
ended up was still 14470 ms so twice as good as the union
I guess it must not construct the in predicate into a working table before starting the query. (until you tell it to via union)
|
|
|
|
|
Grasshopper
      
Group: General Forum Members
Last Login: Saturday, July 19, 2008 3:54 PM
Points: 11,
Visits: 15
|
|
| In my experience it is always faster to join with "exists" then do a painful "in". "In" is only good for several values. With ssis, you can actually use a pre-existing table to insert your values into and join aginst.
|
|
|
|