SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 

Except and Intersect

EXCEPT and INTERSECT are two uncommon commands. Not that they do anything odd but they aren’t exactly well known in any detail or used very often. I’ve been using them myself a bit recently and thought I would explore.

Per BOL:

EXCEPT returns any distinct values from the left query that are not also found on the right query.
INTERSECT returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operand.

These commands have a very similar feel to UNION and UNION ALL. They all join two queries together and require that the queries have the same number of columns in the same order with similar data types. Also the column names for the output will be taken from the first query.

I frequently see EXCEPT and INTERSECT compared to LEFT OUTER JOIN and JOIN and to be fair they are somewhat similar. There are a couple of important differences though.

Aside from the conditions above (number of columns and datatypes) EXCEPT and INTERSECT do not require a JOIN condition. They also return a distinct list of values, and in the case of the LEFT OUTER JOIN a WHERE clause will be needed.

Another option, and much closer to the actual definitions, is to use EXISTS and NOT EXISTS. In fact if you use a DISTINCT (or GROUP BY) and EXISTS or NOT EXISTS you have almost exactly the same thing. (For example you only have access to the columns in table1 and not those in table2.)

-- Setup code
CREATE TABLE Except1 (colField varchar(50))
CREATE TABLE Except2 (colField varchar(50))

INSERT INTO Except1 VALUES ('A'),('A'),('B'),('D'),('E'),('E'),('G'),('H'),('J'),('K')
INSERT INTO Except2 VALUES ('B'),('C'),('E'),('F'),('H'),('I'),('K'),('L')

SET STATISTICS TIME ON
SET NOCOUNT ON

Except

PRINT '
Query 1'
SELECT colField FROM Except1
EXCEPT
SELECT colField FROM Except2

PRINT '
Query 2'
SELECT DISTINCT Except1.colField
FROM Except1
LEFT OUTER JOIN Except2
	ON Except1.colField= Except2.colField
WHERE Except2.colField IS NULL

PRINT '
Query 3'
SELECT DISTINCT Except1.colField
FROM Except1
WHERE NOT EXISTS (SELECT 1 FROM Except2
				WHERE Except1.colField= Except2.colField)

Except1

Except2

Except3

Query 1

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 24 ms.

Query 2

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 236 ms.

Query 3

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 62 ms.

Three different plans all different but all show about the same percentage. The EXCEPT version is quite a bit faster than the LEFT OUTER JOIN and slightly faster than NOT EXISTS.

Intersect

PRINT '
Query 1'
SELECT colField FROM Except1
INTERSECT
SELECT colField FROM Except2

PRINT '
Query 2'
SELECT DISTINCT Except1.colField
FROM Except1
JOIN Except2
	ON Except1.colField= Except2.colField

PRINT '
Query 3'
SELECT DISTINCT Except1.colField
FROM Except1
WHERE EXISTS (SELECT 1 FROM Except2
				WHERE Except1.colField= Except2.colField)

Intersect1

Intersect2

Intersect3

Query 1

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 16 ms,  elapsed time = 10 ms.

Query 2

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 120 ms.

Query 3

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

So this time the query plan for the JOIN shows the highest percentage. It also has a significantly longer execution time. The other two are pretty similar.
 

Summary

I should point out that with this few records the times are pretty suspect. Not to mention that you should take the percentages within a query plan with a grain of salt as they are frequently unreliable. Another issue is that I’m only dealing with a single column. If you are dealing with multiple columns the ON clauses (and the WHERE clause in the EXISTS/NOT EXISTS) are going to get more and more complex and probably take up more and more overhead. These results, however, do seem to point out that if they will work for your output EXCEPT and INTERSECT are worth taking a looking at.


Filed under: Microsoft SQL Server, SQLServerPedia Syndication, T-SQL Tagged: code language, language sql, microsoft sql server, sql statements, T-SQL

SQLStudies

My name is Kenneth Fisher and I am Senior DBA for a large (multi-national) insurance company. I have been working with databases for over 20 years starting with Clarion and Foxpro. I’ve been working with SQL Server for 12 years but have only really started “studying” the subject for the last 3. I don’t have any real "specialities" but I enjoy trouble shooting and teaching. Thus far I’ve earned by MCITP Database Administrator 2008, MCTS Database Administrator 2005, and MCTS Database Developer 2008. I’m currently studying for my MCITP Database Developer 2008 and should start in on the 2012 exams next year. My blog is at www.sqlstudies.com.

Comments

Leave a comment on the original post [sqlstudies.com, opens in a new window]

Loading comments...