Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Intersect, Except, Union, All and Any

By David Poole, (first published: 2009/07/20)

I was reviewing the Microsoft training courses for SQL2008 for a set of application developers who wanted to learn more about SQL Server. In doing so I came across some commands that I either had not heard of or had not used before.

  • INTERSECT
  • EXCEPT
  • ALL
  • ANY

ALL and ANY have been around for donkeys years but INTERSECT and EXCEPT were new to me.

Whenever I find a new set of commands in T-SQL I fire up my developer edition and go straight to Adventureworks to start playing around.

INTERSECT, EXCEPT and UNION

To experiment with these commands I decided to look at two sets of CustomerID values

  • Customers in sales territory 10 (United Kingdom)
  • Sales orders in July 2004, the last month of orders in Adventureworks

Perhaps the best way to show what these commands do when comparing two sets of data is to look at the diagrams below

Predicate Illustration Description
EXCEPT image001 Customers from the UK who have not bought in the last month (July 2004)
INTERSECT INTERSECT Customers from the UK
AND
who have made a purchase in the last month (July 2004)
UNION UNION Customers from the UK
OR
who have made a purchase in the last month (July 2004)

Different ways of writing an EXCEPT query

Although I had not used the EXCEPT before I had achieved the same ends by more traditional methods. As any DBA will know there are many ways of achieving exactly the same thing so I will illustrate 3 solutions.

LEFT JOIN

The following query yielded the results I needed

        SELECT C.CustomerID
        FROM Sales.Customer AS C
        	LEFT JOIN Sales.SalesOrderHeader AS OH
        	ON C.CustomerID = OH.CustomerID
        	AND OrderDate>='2004-07-01' 
        WHERE OH.CustomerID IS NULL 
        AND C.TerritoryID=10
    

Except - LEFT JOIN

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Customer'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 196 ms.
     

The overall query cost was 0.0517279

WHERE CustomerID NOT IN(…)

Being a diligent DBA I wanted to try the query a different way

        SELECT CustomerID
        FROM Sales.Customer
        WHERE TerritoryID=10
        AND CustomerID NOT IN(
        	SELECT customerid
        	FROM  Sales.SalesOrderHeader
        	WHERE OrderDate>='2004-07-01' 
        )
    

EXCEPT NOT IN

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Customer'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 160 ms.

This time the overall query cost was 0.0500904 offering a slight improvement.

EXCEPT

Finally I tried the EXCEPT predicate

        SELECT CustomerID
        FROM Sales.Customer
        WHERE TerritoryID=10
        	EXCEPT
        SELECT customerid
        FROM Sales.SalesOrderHeader
        WHERE OrderDate>='2004-07-01' 
    

image007

Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Customer'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 202 ms.
    

The time for the EXCEPT predicate was 0.0500904. Exactly the same as for the WHERE NOT IN(…) query. You can also see that the execution plans are the same.

Different ways of writing an INTERSECT query

I decided to take the same approach with the INTERSECT query and write the equivalentquery in 3 separate ways.

INNER JOIN

As any customer can have more than one order I am going to have to make a distinct list of CustomerID values. I decided to try a couple of approaches.

        SELECT DISTINCT C.CustomerID
        FROM Sales.Customer AS C
        	INNER JOIN Sales.SalesOrderHeader AS OH
        	ON C.CustomerID = OH.CustomerID
        WHERE
        	C.TerritoryID=10
        	AND OH.OrderDate>='2004-07-01' 

        SELECT C.CustomerID
        FROM Sales.Customer AS C
        	INNER JOIN (SELECT DISTINCT CustomerID FROM Sales.SalesOrderHeader WHERE OrderDate>='2004-07-01' 
        )AS OH
        	ON C.CustomerID = OH.CustomerID
        WHERE C.TerritoryID=10
    

This produced two differing execution plans with an overall cost of 0.0500859 and 0.0513978 respectively.

INNER JOIN

WHERE CustomerID IN(…)

This time the query below yielded an execution cost of 0.0499945

        SELECT CustomerID
        FROM Sales.Customer
        WHERE TerritoryID=10
        AND CustomerID IN(
        	SELECT customerid
        	FROM  Sales.SalesOrderHeader
        	WHERE OrderDate>='2004-07-01' 
        )
    

INTERSECT IN

INTERSECT

Finally, as before the INTERSECT query presented an identical execution plan to the WHERE CustomerID IN ( query

        SELECT CustomerID
        FROM Sales.Customer
        WHERE TerritoryID=10
        	INTERSECT
        SELECT customerid
        FROM Sales.SalesOrderHeader
        WHERE OrderDate>='2004-07-01' 
    

Conclusion for INTERSECT and EXCEPT

Granted Adventureworks is a pretty contrived example and under such conditions the example I have given here INTERSECT and EXCEPT are as good as the best of the rest.

INTERSECT
Predicate Result
INNER JOIN 0.0500859
INNER JOIN (with DISTINCT derived query) 0.0513978
IN 0.0499945
INTERSECT 0.0499945
      
EXCEPT
Predicate Result
LEFT JOIN 0.0517279
NOT IN 0.0500904
EXCEPT 0.0500904

At this stage I cannot think of a compelling reason to use these new predicates over and above the more usual WHERE field NOT IN (…) particularly as the query on both sides of an EXCEPT or INTERSECT has to return the same recordset structure.

I did consider using it as a means of deduplicating two address record sets but that would be an even more contrived example.

The test does emphasize that it is worthwhile trying to write the query more than one way in order to determine the best option for the particular scenario.

The ANY and ALL Predicate

ANY and ALL are predicates I have never needed to use.

ANY

The two queries below offer up the same results and execution plan

        SELECT *
        FROM Sales.SalesPerson
        WHERE TerritoryID = ANY(
        	SELECT TerritoryID FROM Sales.SalesTerritory WHERE CountryRegionCode='US'
        )

        SELECT *
        FROM Sales.SalesPerson
        WHERE TerritoryID IN(
        	SELECT TerritoryID FROM Sales.SalesTerritory WHERE CountryRegionCode='US'
        )
        

To look at how ANY may be used in a more practical sense I decided to create a Sales.TopSales table in Adventureworks using the query below

        SELECT TOP 5 SalesOrderID,TotalDue
        INTO Sales.TopSales
        FROM Sales.SalesOrderHeader
        ORDER BY Sales.TotalDue DESC

        ALTER TABLE Sales.TopSales
        	ADD CONSTRAINT PK_TopSales PRIMARY KEY CLUSTERED( SalesOrderID)

        CREATE INDEX idx_TopSales_TotalDue ON Sales.TopSales(TotalDue DESC)
    

For my example I decided to retrieve sales that exceeded the minimum in my Sales.TopSales table.

        SELECT *
        FROM Sales.SalesOrderHeader
        WHERE TotalDue > ANY(SELECT TotalDue FROM Sales.TopSales)
        ORDER BY Sales.TotalDue DESC
    

An alternative way of writing this query could be as follows

        SELECT *
        FROM Sales.SalesOrderHeader
        WHERE TotalDue > (SELECT MIN(TotalDue) FROM Sales.TopSales)
        ORDER BY Sales.TotalDue DESC
    

The execution plans show considerable differences

  • ANY query cost = 3.47092
  • Alternative query cost = 1.24152

image012

ALL

The ALL predicate allows a comparison to be made against all the values in a SELECT list. The following two queries are identical

        SELECT *
        FROM Sales.SalesOrderHeader
        WHERE TotalDue > ALL(SELECT TotalDue FROM Sales.TopSales)
        ORDER BY Sales.TotalDue DESC 

        SELECT *
        FROM Sales.SalesOrderHeader
        WHERE TotalDue > (SELECT MAX(TotalDue) FROM Sales.TopSales)
        ORDER BY Sales.TotalDue DESC 

Once again the execution plans tell the full story

  • ALL query cost 3.47068
  • Alternative query cost 1.24152

image014

Conclusion for ALL and ANY

ALL and ANY certainly date back to SQL Server 7.0 and possibly before that. I have never found a reason to use them and given my findings I see no benefit to them.

Both predicates insist that their contents are produced from SELECT statements so where an IN predicate can contain a static list or a SELECT statement, ANY can only contain a SELECT statement.

The best I could results I could achieve for the ANY predicate simply equaled those for the more flexible IN predicate.

Rounding it all up

ALL and ANY do not seem to offer any benefits to offset their disadvantages

INTERSECT and EXCEPT do at least offer seem to offer performance on par with alternatives but so far I cannot see a benefit to using them. They are however part of the ANSI standard SQL.

Total article views: 39684 | Views in the last 30 days: 16
 
Related Articles
FORUM

Help writing query logic

Help writing query logic

FORUM

Select query

Select query

FORUM

Query with high lob logical reads / read-ahead reads

Hi I have a table with a column of image data type. Our application fires queries at this table,...

FORUM

Select Smart query

Select Query

FORUM

about logical reads question

logical reads

Tags
except    
intersect    
t-sql    
union    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones