The Pitfall of "Not Equal To" Operator in Queries!


Yaroslav Pentsarskyy-353753
Comments posted to this topic are about the content posted at temp


Regards,

Yaroslav

nileshsane

Nice article.

Another way of writing the same thing, for those who prefer the old convention (I don't recommend it):

SELECT s.*, se.*
FROM Students s,
     (SELECT * FROM StudentExam WHERE ExamName = 'SQL Server') se
WHERE s.StID != se.StID
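
A quick way to check this comma-join form is to run it against the sample rows posted later in this thread. The sketch below is only an illustration: it assumes SQLite through Python's sqlite3 module rather than SQL Server, and the extra (2, 'SQL Server') row is a hypothetical addition that is not in the thread's data. Under those assumptions, the query appears to match the expected answer only while exactly one student has a 'SQL Server' row.

```python
import sqlite3

# In-memory SQLite database (an assumption; the thread itself targets SQL Server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StID INTEGER PRIMARY KEY, StName TEXT NOT NULL);
CREATE TABLE StudentExam (StID INTEGER NOT NULL, ExamName TEXT NOT NULL,
                          PRIMARY KEY (StID, ExamName));
INSERT INTO Students VALUES (1,'Jack'), (2,'Anna'), (3,'Bob');
INSERT INTO StudentExam VALUES
    (1,'SQL Server'), (2,'VB.NET'), (2,'C#.NET'), (1,'XML');
""")

# The comma-join form with != from the post above.
query = """
SELECT s.StName
FROM Students s,
     (SELECT * FROM StudentExam WHERE ExamName = 'SQL Server') se
WHERE s.StID != se.StID
ORDER BY s.StID
"""

one_taker = [row[0] for row in conn.execute(query)]
print(one_taker)   # ['Anna', 'Bob']

# Hypothetical extra row: Anna also takes the 'SQL Server' exam.
conn.execute("INSERT INTO StudentExam VALUES (2, 'SQL Server')")
two_takers = [row[0] for row in conn.execute(query)]
print(two_takers)  # ['Jack', 'Anna', 'Bob', 'Bob']
```

With two exam takers, every student differs from at least one of them, so everyone comes back, and Bob comes back twice.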




What I hear I forget, what I see I remember, what I do I understand
Vladan

I'm not sure why one would use precisely this form. IMHO it is more complicated than necessary, and the complication does not bring any advantage:

SELECT s.* FROM Students s
LEFT JOIN (SELECT StID FROM StudentExam WHERE ExamName = 'SQL Server') se ON s.StID = se.StID
WHERE se.StID IS NULL

I would prefer one of these variants:

Variant 1) simplest
SELECT s.* FROM Students s
LEFT JOIN StudentExam E ON E.StID = s.StID AND E.ExamName = 'SQL Server'
WHERE E.StID IS NULL

Variant 2) derived table, to avoid possible duplicates
SELECT s.* FROM Students s
LEFT JOIN (SELECT DISTINCT StID FROM StudentExam WHERE ExamName = 'SQL Server') se
ON s.StID = se.StID
WHERE se.StID IS NULL

Is there any reason to use the code as shown in the article?

Yes, I know that with a NOT condition there can't be any duplicates, because I only take the students that have no corresponding rows. But generally, with similar JOINs, duplicates are what tends to cause problems, so I thought I'd mention it, also because that is the main reason why and when I would use the derived table. Otherwise, variant 1 should be good enough.
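
The two variants can be compared side by side against the sample rows posted later in this thread. This is a minimal sketch, assuming SQLite through Python's sqlite3 module instead of SQL Server; the final unfiltered LEFT JOIN is not from the post and is included only to show where duplicates would come from in a similar join.

```python
import sqlite3

# In-memory SQLite database (an assumption; the thread itself targets SQL Server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StID INTEGER PRIMARY KEY, StName TEXT NOT NULL);
CREATE TABLE StudentExam (StID INTEGER NOT NULL, ExamName TEXT NOT NULL,
                          PRIMARY KEY (StID, ExamName));
INSERT INTO Students VALUES (1,'Jack'), (2,'Anna'), (3,'Bob');
INSERT INTO StudentExam VALUES
    (1,'SQL Server'), (2,'VB.NET'), (2,'C#.NET'), (1,'XML');
""")

# Variant 1: exam filter inside the ON clause, then keep the unmatched students.
variant_1 = """
SELECT s.StName FROM Students s
LEFT JOIN StudentExam e ON e.StID = s.StID AND e.ExamName = 'SQL Server'
WHERE e.StID IS NULL
ORDER BY s.StID
"""

# Variant 2: same idea with a DISTINCT derived table.
variant_2 = """
SELECT s.StName FROM Students s
LEFT JOIN (SELECT DISTINCT StID FROM StudentExam
           WHERE ExamName = 'SQL Server') se ON s.StID = se.StID
WHERE se.StID IS NULL
ORDER BY s.StID
"""

rows_1 = [row[0] for row in conn.execute(variant_1)]
rows_2 = [row[0] for row in conn.execute(variant_2)]
print(rows_1, rows_2)  # ['Anna', 'Bob'] ['Anna', 'Bob']

# Illustration only: an unfiltered LEFT JOIN repeats students with several exams.
unfiltered_count = conn.execute(
    "SELECT COUNT(*) FROM Students s LEFT JOIN StudentExam e ON e.StID = s.StID"
).fetchone()[0]
print(unfiltered_count)  # 5 rows for 3 students
```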





Duray AKAR

SELECT s.* FROM Students s
JOIN StudentExam se
ON s.StID=se.StID
WHERE se.ExamName<>'SQL Server'

is definitely NOT the query to use to retrieve
"the students who have not taken the 'SQL Server' exam"
in the given data architecture.

It is a query to get
"the students who have taken an exam that is NOT 'SQL Server'".

And it successfully delivers that result.
What if there are students who have not taken ANY exams?

So I don't think this is a pitfall of the "<>" operator at all!

The query should have been written in the final form that the author suggests to begin with.
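
To see which question the inner join with "<>" actually answers, it can be run against the sample rows posted later in this thread. A minimal sketch, assuming SQLite through Python's sqlite3 module rather than SQL Server:

```python
import sqlite3

# In-memory SQLite database (an assumption; the thread itself targets SQL Server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StID INTEGER PRIMARY KEY, StName TEXT NOT NULL);
CREATE TABLE StudentExam (StID INTEGER NOT NULL, ExamName TEXT NOT NULL,
                          PRIMARY KEY (StID, ExamName));
INSERT INTO Students VALUES (1,'Jack'), (2,'Anna'), (3,'Bob');
INSERT INTO StudentExam VALUES
    (1,'SQL Server'), (2,'VB.NET'), (2,'C#.NET'), (1,'XML');
""")

# Inner join plus <>: one row per exam that is not 'SQL Server'.
other_exams = conn.execute("""
SELECT s.StName, se.ExamName
FROM Students s
JOIN StudentExam se ON s.StID = se.StID
WHERE se.ExamName <> 'SQL Server'
ORDER BY s.StID, se.ExamName
""").fetchall()

print(other_exams)
# [('Jack', 'XML'), ('Anna', 'C#.NET'), ('Anna', 'VB.NET')]
```

Jack appears because of his XML exam even though he took 'SQL Server', Anna appears once per exam, and Bob, who has no exam rows, never appears.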


Tatsu

Here is how I would answer the original question:

select s.* from Students s
where not exists
(select 1 from StudentExam se
where se.ExamName = 'SQL Server'
and s.StID=se.StID)

While I have seen some performance issues when using an IN (SELECT ...) clause, EXISTS seems to work great.
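
For comparison, the NOT EXISTS form and an equivalent NOT IN form can be run against the sample rows posted later in this thread. This sketch assumes SQLite through Python's sqlite3 module rather than SQL Server, and the NOT IN query is an illustrative rewrite that does not appear in the thread. It only checks that both forms return the same rows here and says nothing about their relative performance.

```python
import sqlite3

# In-memory SQLite database (an assumption; the thread itself targets SQL Server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StID INTEGER PRIMARY KEY, StName TEXT NOT NULL);
CREATE TABLE StudentExam (StID INTEGER NOT NULL, ExamName TEXT NOT NULL,
                          PRIMARY KEY (StID, ExamName));
INSERT INTO Students VALUES (1,'Jack'), (2,'Anna'), (3,'Bob');
INSERT INTO StudentExam VALUES
    (1,'SQL Server'), (2,'VB.NET'), (2,'C#.NET'), (1,'XML');
""")

# Correlated NOT EXISTS, as in the post above.
not_exists_sql = """
SELECT s.StName FROM Students s
WHERE NOT EXISTS (SELECT 1 FROM StudentExam se
                  WHERE se.ExamName = 'SQL Server' AND s.StID = se.StID)
ORDER BY s.StID
"""

# Illustrative NOT IN rewrite (safe here because StudentExam.StID is NOT NULL).
not_in_sql = """
SELECT s.StName FROM Students s
WHERE s.StID NOT IN (SELECT se.StID FROM StudentExam se
                     WHERE se.ExamName = 'SQL Server')
ORDER BY s.StID
"""

not_exists_rows = [row[0] for row in conn.execute(not_exists_sql)]
not_in_rows = [row[0] for row in conn.execute(not_in_sql)]
print(not_exists_rows, not_in_rows)  # ['Anna', 'Bob'] ['Anna', 'Bob']
```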



Bryant E. Byrd, BSSE MCDBA MCAD
Business Intelligence Administrator
MSBI Administration Blog
Mike C

Here are your sample tables with indexes on them. The indexes should eliminate the table scans in your query plan.

CREATE TABLE Students (
StID INT NOT NULL PRIMARY KEY NONCLUSTERED,
StName NVARCHAR(50) NOT NULL)
GO

CREATE CLUSTERED INDEX IX_Students
ON Students (StName)
GO

INSERT Students VALUES (1,'Jack')
INSERT Students VALUES (2,'Anna')
INSERT Students VALUES (3,'Bob')
GO

CREATE TABLE StudentExam (
StID INT NOT NULL,
ExamName VARCHAR(50) NOT NULL,
PRIMARY KEY (StID, ExamName))
GO

INSERT StudentExam VALUES (1,'SQL Server')
INSERT StudentExam VALUES (2,'VB.NET')
INSERT StudentExam VALUES (2,'C#.NET')
INSERT StudentExam VALUES (1,'XML')
GO

Then run your query and see if it comes back with a more efficient query plan.


Jason Hopkins
Subqueries are usually a bad idea as a first option. How about a left join, perhaps with a DISTINCT in the SELECT clause?
Dan Knowlton
The point of the article is excellent. It reinforces the fact that we as SQL programmers need to be careful and be sure of the results that our queries will return. It's a reminder to us veterans to not be complacent and whip out some crappy code when we're in a hurry. It's a nice lesson for new programmers who could easily fall into the trap of using the "Not Equal To" operator when they should really be using a sub-query or outer join.
Donald Eberhart

Ehsan,

Bob does not show up in your results.

Regards,

Don


Andy DBA

If you "Check out" Ehsan's example you'll find Jack is retrieved even though he did take the class and, even though Bob did not take the class, he is not retrieved.

"Stupid people compose a query and expect other unusual result "

Vladan's variant 1 example returns the correct results because the ExamName criterion is in the join, NOT the where clause. This causes all StudentExam columns in the join's result set to be NULL for students without a 'SQL Server' exam. The where clause then keeps only those NULL rows and filters out the rows that did find a match.

In Ehsan's example, putting the ExamName <> 'SQL Server' criterion in the where clause filters out all rows where ExamName = 'SQL Server'. Jack is still retrieved because there is another row for him in the result set where ExamName <> 'SQL Server' (his XML exam).

This also causes Bob to be filtered out, because all of the StudentExam columns in the result set are NULL for Bob (who didn't take any exams). ANY COMPARISON OF A NON-NULL TO A NULL IS NULL (strictly speaking, UNKNOWN). So ExamName <> 'SQL Server' evaluates to NULL even though it seems like it should evaluate to TRUE, and because a where clause keeps only rows whose condition is TRUE, the row is filtered out. Bye bye Bob.
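
The NULL comparison described above can be checked directly. The sketch below assumes SQLite through Python's sqlite3 module rather than SQL Server. Ehsan's query is not shown in this thread, so the LEFT JOIN with the "<>" test in the where clause is only a guess at its shape, based on the description above.

```python
import sqlite3

# In-memory SQLite database (an assumption; the thread itself targets SQL Server).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StID INTEGER PRIMARY KEY, StName TEXT NOT NULL);
CREATE TABLE StudentExam (StID INTEGER NOT NULL, ExamName TEXT NOT NULL,
                          PRIMARY KEY (StID, ExamName));
INSERT INTO Students VALUES (1,'Jack'), (2,'Anna'), (3,'Bob');
INSERT INTO StudentExam VALUES
    (1,'SQL Server'), (2,'VB.NET'), (2,'C#.NET'), (1,'XML');
""")

# Comparing NULL to a string yields NULL, which sqlite3 returns as None.
null_comparison = conn.execute("SELECT NULL <> 'SQL Server'").fetchone()[0]
print(null_comparison)  # None

# Assumed shape of the query under discussion: LEFT JOIN, filter in WHERE.
where_filtered = [row[0] for row in conn.execute("""
SELECT s.StName
FROM Students s
LEFT JOIN StudentExam se ON s.StID = se.StID
WHERE se.ExamName <> 'SQL Server'
ORDER BY s.StID
""")]
print(where_filtered)  # ['Jack', 'Anna', 'Anna']
```

Bob's only joined row has a NULL ExamName, so the where clause discards it, while Jack survives through his XML row and Anna appears twice.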

There's more good stuff in BOL on NULL comparisons in the "Comparison Search Conditions" subtopic under "null values" in the index.

Finally, Ehsan's example will return duplicates for students taking more than one exam other than 'SQL Server'. The beauty of Vladan's example is that you only get one row in the result set for each row in the "left" table without a match in the "right" table so "select distinct" is not required.




