Blog Post

Except and Intersect

,

EXCEPT and INTERSECT are two uncommon commands. Not that they do anything odd but they aren’t exactly well known in any detail or used very often. I’ve been using them myself a bit recently and thought I would explore.

Per BOL:

EXCEPT returns any distinct values from the left query that are not also found on the right query.

INTERSECT returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operand.

These commands have a very similar feel to UNION and UNION ALL. They all join two queries together and require that the queries have the same number of columns in the same order with similar data types. Also the column names for the output will be taken from the first query.

I frequently see EXCEPT and INTERSECT compared to LEFT OUTER JOIN and JOIN and to be fair they are somewhat similar. There are a couple of important differences though.

Aside from the conditions above (number of columns and datatypes) EXCEPT and INTERSECT do not require a JOIN condition. They also return a distinct list of values, and in the case of the LEFT OUTER JOIN a WHERE clause will be needed.

Another option, and much closer to the actual definitions, is to use EXISTS and NOT EXISTS. In fact if you use a DISTINCT (or GROUP BY) and EXISTS or NOT EXISTS you have almost exactly the same thing. (For example you only have access to the columns in table1 and not those in table2.)

-- Setup code
CREATE TABLE Except1 (colField varchar(50))
CREATE TABLE Except2 (colField varchar(50))
INSERT INTO Except1 VALUES ('A'),('A'),('B'),('D'),('E'),('E'),('G'),('H'),('J'),('K')
INSERT INTO Except2 VALUES ('B'),('C'),('E'),('F'),('H'),('I'),('K'),('L')
SET STATISTICS TIME ON
SET NOCOUNT ON

Except

PRINT '
Query 1'
SELECT colField FROM Except1
EXCEPT
SELECT colField FROM Except2
PRINT '
Query 2'
SELECT DISTINCT Except1.colField
FROM Except1
LEFT OUTER JOIN Except2
ON Except1.colField= Except2.colField
WHERE Except2.colField IS NULL
PRINT '
Query 3'
SELECT DISTINCT Except1.colField
FROM Except1
WHERE NOT EXISTS (SELECT 1 FROM Except2
WHERE Except1.colField= Except2.colField)

Except1

Except2

Except3

Query 1
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 24 ms.
Query 2
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 236 ms.
Query 3
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 62 ms.

Three different plans all different but all show about the same percentage. The EXCEPT version is quite a bit faster than the LEFT OUTER JOIN and slightly faster than NOT EXISTS.

Intersect

PRINT '
Query 1'
SELECT colField FROM Except1
INTERSECT
SELECT colField FROM Except2
PRINT '
Query 2'
SELECT DISTINCT Except1.colField
FROM Except1
JOIN Except2
ON Except1.colField= Except2.colField
PRINT '
Query 3'
SELECT DISTINCT Except1.colField
FROM Except1
WHERE EXISTS (SELECT 1 FROM Except2
WHERE Except1.colField= Except2.colField)

Intersect1

Intersect2

Intersect3

Query 1
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 16 ms,  elapsed time = 10 ms.
Query 2
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 120 ms.
Query 3
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.
 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 0 ms.

So this time the query plan for the JOIN shows the highest percentage. It also has a significantly longer execution time. The other two are pretty similar.

 

Summary

I should point out that with this few records the times are pretty suspect. Not to mention that you should take the percentages within a query plan with a grain of salt as they are frequently unreliable. Another issue is that I’m only dealing with a single column. If you are dealing with multiple columns the ON clauses (and the WHERE clause in the EXISTS/NOT EXISTS) are going to get more and more complex and probably take up more and more overhead. These results, however, do seem to point out that if they will work for your output EXCEPT and INTERSECT are worth taking a looking at.

Filed under: Microsoft SQL Server, SQLServerPedia Syndication, T-SQL Tagged: code language, language sql, microsoft sql server, sql statements, T-SQL

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating