
UNION vs UNION ALL

You might be wondering why I’m going into such a simple subject. Well, the way I see it, there are four options here.

  • You already know the difference; it seems really obvious, and you are probably wondering why I’m mentioning it.
  • You think you know the difference, but it turns out you are wrong (don’t worry, it happens).
  • You don’t know the difference, and once I’ve pointed it out you will wonder why on earth you never thought of it before.
  • You don’t care.

 
If you don’t care, there isn’t much help I can give you. If you already know what I have to say, then you don’t need help. That leaves a 50% chance that you will find this interesting. So here goes.

At its simplest, the difference is that UNION returns a distinct list of rows and UNION ALL returns all rows, duplicates included.

--Table setup
CREATE TABLE UnionTable1 (Id INT);
CREATE TABLE UnionTable2 (Id INT);

--The values 6 and 12 appear in both tables
INSERT INTO UnionTable1 VALUES (2), (4), (6), (8), (10), (12);
INSERT INTO UnionTable2 VALUES (3), (6), (9), (12);

--Union example: duplicates across the two queries are removed
SELECT Id AS [UNION] FROM UnionTable1
UNION
SELECT Id FROM UnionTable2;

--Union all example: every row from both queries is returned
SELECT Id AS [UNION ALL] FROM UnionTable1
UNION ALL
SELECT Id FROM UnionTable2;

[Image: result sets of the UNION and UNION ALL queries]
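If you want to check the difference without eyeballing the result sets, you can count the rows each operator returns. This is just a quick sketch against the UnionTable1 and UnionTable2 tables created above; the column and derived-table aliases (UnionRows, UnionAllRows, u, ua) are names I made up for the example.

```sql
-- UNION: the shared values 6 and 12 are returned once each
SELECT COUNT(*) AS UnionRows        -- 8
FROM (SELECT Id FROM UnionTable1
      UNION
      SELECT Id FROM UnionTable2) AS u;

-- UNION ALL: 6 rows from UnionTable1 plus 4 rows from UnionTable2
SELECT COUNT(*) AS UnionAllRows     -- 10
FROM (SELECT Id FROM UnionTable1
      UNION ALL
      SELECT Id FROM UnionTable2) AS ua;
```

The two extra rows in the UNION ALL count are the second copies of 6 and 12.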

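Where things get a little bit interesting is how UNION generates that distinct list. You will notice that the UNION output is in order while the UNION ALL output is not. To remove the duplicates in this example, SQL Server sorts the values, which means an additional sort operator in the execution plan. Keep in mind that the ordering is a side effect of that sort and not a guarantee: the optimizer can pick other ways to remove duplicates, so if you need the rows in order, use an ORDER BY.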
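One way to picture the extra work is to write the de-duplication out by hand. Logically, a UNION gives the same rows as a SELECT DISTINCT over the UNION ALL result. The sketch below assumes the same two sample tables; the alias Combined is made up for the example, and I am not claiming the two forms always get identical execution plans.

```sql
-- Same rows as the UNION query: 2, 3, 4, 6, 8, 9, 10, 12 (in no guaranteed order)
SELECT DISTINCT Id
FROM (SELECT Id FROM UnionTable1
      UNION ALL
      SELECT Id FROM UnionTable2) AS Combined;
```

Seen this way, UNION is UNION ALL plus a duplicate-removal step, and that step is where the additional cost comes from.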

[Image: execution plan for the UNION query]

For comparison, here is the execution plan for UNION ALL.

[Image: execution plan for the UNION ALL query]

Notice that the sort operator for the UNION is by far the most expensive part of the whole process.
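If you would like to compare the two plans yourself, one option is SET SHOWPLAN_TEXT, which returns the estimated plan as text instead of running the query. This sketch assumes you are running it from a client that understands the GO batch separator (such as SSMS or sqlcmd) and that the sample tables above exist. The exact operators you see may differ depending on your version and data.

```sql
-- SET SHOWPLAN_TEXT must be the only statement in its batch
SET SHOWPLAN_TEXT ON;
GO

-- Plan for UNION: look for the operator that removes the duplicates
SELECT Id FROM UnionTable1
UNION
SELECT Id FROM UnionTable2;
GO

-- Plan for UNION ALL: the two scans are simply concatenated
SELECT Id FROM UnionTable1
UNION ALL
SELECT Id FROM UnionTable2;
GO

SET SHOWPLAN_TEXT OFF;
GO
```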

So what does that mean for you? Unless you actually need UNION (i.e., you need to get rid of duplicates), you want to use UNION ALL, as it’s the much cheaper and faster option.

There are a couple of exceptions. If you are doing a UNION in an EXISTS clause, then SQL Server knows enough not to bother with the sort, and the execution times are the same. Also, if you are already sorting the output (using an ORDER BY), then most of the cost is already taken care of.
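Here is roughly what those two cases look like, again using the sample tables from above. The WHERE filters and PRINT messages are just for illustration, and I have not included timings, so compare the plans on your own system before relying on this.

```sql
-- Case 1: UNION inside EXISTS. Only the existence of a row matters,
-- so removing duplicates cannot change the answer.
IF EXISTS (SELECT Id FROM UnionTable1 WHERE Id = 6
           UNION
           SELECT Id FROM UnionTable2 WHERE Id = 6)
    PRINT 'Found with UNION';

IF EXISTS (SELECT Id FROM UnionTable1 WHERE Id = 6
           UNION ALL
           SELECT Id FROM UnionTable2 WHERE Id = 6)
    PRINT 'Found with UNION ALL';

-- Case 2: the output is sorted anyway. The ORDER BY applies to the
-- combined result, so a sort is needed with either operator.
SELECT Id FROM UnionTable1
UNION
SELECT Id FROM UnionTable2
ORDER BY Id;

SELECT Id FROM UnionTable1
UNION ALL
SELECT Id FROM UnionTable2
ORDER BY Id;
```

Note that the last two queries still return different rows: the UNION ALL version keeps both copies of 6 and 12.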

Like I said, this is all fairly simple and straightforward, but you would be surprised how often people don’t think about it.



SQLStudies

My name is Kenneth Fisher and I am Senior DBA for a large (multi-national) insurance company. I have been working with databases for over 20 years, starting with Clarion and FoxPro. I’ve been working with SQL Server for 12 years but have only really started “studying” the subject for the last 3. I don’t have any real "specialities" but I enjoy troubleshooting and teaching. Thus far I’ve earned my MCITP Database Administrator 2008, MCTS Database Administrator 2005, and MCTS Database Developer 2008. I’m currently studying for my MCITP Database Developer 2008 and should start in on the 2012 exams next year. My blog is at www.sqlstudies.com.
