UNION vs UNION ALL

Kenneth Fisher, 2014-10-16

You might be wondering why I’m going into such a simple subject. Well, the way I see it, there are four options here.

  • You already know the difference, it seems really obvious and you are probably wondering why I’m mentioning it.
  • You think you know the difference but it turns out you are wrong (don’t worry, it happens).
  • You don’t know the difference and once I’ve pointed it out you will wonder why on earth you never thought of it before.
  • You don’t care.

If you don’t care there isn’t much help I can give you. If you already know what I have to say then you don’t need help. That leaves a 50% chance that you will find this interesting. So here goes.

At its simplest the difference is that UNION returns a distinct list of rows and UNION ALL returns all rows.

--Table setup
CREATE TABLE UnionTable1 (Id INT);
CREATE TABLE UnionTable2 (Id INT);
INSERT INTO UnionTable1 VALUES (2), (4), (6), (8), (10), (12);
INSERT INTO UnionTable2 VALUES (3), (6), (9), (12);

--UNION example: the duplicates (6 and 12) are returned only once
SELECT Id AS [UNION] FROM UnionTable1
UNION
SELECT Id FROM UnionTable2;

--UNION ALL example: every row from both tables is returned
SELECT Id AS [UNION ALL] FROM UnionTable1
UNION ALL
SELECT Id FROM UnionTable2;

[Image: result sets of the UNION and UNION ALL queries]
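To put numbers on that, here is a small sketch against the two tables from the setup above. The column aliases AllRows and DistinctRows and the derived-table alias u are just names I picked for illustration.

```sql
-- Count what each operator returns for the sample tables
SELECT
    (SELECT COUNT(*) FROM (SELECT Id FROM UnionTable1
                           UNION ALL
                           SELECT Id FROM UnionTable2) AS u) AS AllRows,      -- 10
    (SELECT COUNT(*) FROM (SELECT Id FROM UnionTable1
                           UNION
                           SELECT Id FROM UnionTable2) AS u) AS DistinctRows; -- 8

-- UNION returns the same rows as a DISTINCT over the UNION ALL
SELECT DISTINCT Id
FROM (SELECT Id FROM UnionTable1
      UNION ALL
      SELECT Id FROM UnionTable2) AS u;
```

The two missing rows are the second copies of 6 and 12, which appear in both tables.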

Where things get a little bit interesting is how UNION handles generating that distinct list. You will notice that the UNION output is in order while the UNION ALL output is not. To generate the distinct list here, UNION sorts the values, which means an additional sort operator in the execution plan. (That ordering is a side effect of the plan and not a guarantee. Only an ORDER BY guarantees the order, and the optimizer can pick other ways to remove duplicates.)
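If you want to look at the plans yourself, here is a minimal sketch. It assumes the two tables from the setup above and a tool such as SSMS or sqlcmd that understands the GO batch separator. SET SHOWPLAN_TEXT returns the estimated plan as text instead of running the query.

```sql
-- SET SHOWPLAN_TEXT has to be the only statement in its batch
SET SHOWPLAN_TEXT ON;
GO
-- Plan for UNION: look for the operator that removes the duplicates
SELECT Id FROM UnionTable1
UNION
SELECT Id FROM UnionTable2;
GO
-- Plan for UNION ALL: the two scans are simply concatenated
SELECT Id FROM UnionTable1
UNION ALL
SELECT Id FROM UnionTable2;
GO
SET SHOWPLAN_TEXT OFF;
GO
```

The exact operators you see can differ with table size, indexes and SQL Server version, so treat the output as something to compare, not something to memorize.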

[Image: execution plan for the UNION query, including the sort operator]

For comparison here is the execution plan for UNION ALL.

[Image: execution plan for the UNION ALL query]

Notice that the sort operator for the UNION is by far the most expensive part of the whole process.

So what does that mean for you? Unless you actually need UNION (i.e., you need to get rid of duplicates), use UNION ALL, as it’s the much cheaper and faster option.
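One common case where you don’t need UNION is when the two queries can’t return the same row in the first place. A quick sketch, assuming the Id values inside UnionTable1 are already unique (they are in the sample data) and using WHERE filters I made up for illustration:

```sql
-- Hypothetical filters: no Id can satisfy both branches,
-- so UNION ALL cannot return a duplicate that UNION would have removed
SELECT Id FROM UnionTable1 WHERE Id < 6
UNION ALL
SELECT Id FROM UnionTable1 WHERE Id >= 6;
```

If a single branch could return duplicates on its own, UNION would remove those too, so check that before swapping one for the other.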

There are a couple of exceptions. If you are doing a UNION in an EXISTS clause then SQL Server knows enough not to bother with the sort, and the execution times are the same. Also, if you are already sorting the output (using an ORDER BY), then most of the cost is already taken care of.
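Here is roughly what those two exceptions look like, again using the sample tables from the setup. This is only a sketch of the query shapes. I’d suggest comparing the plans for the UNION and UNION ALL versions on your own system before relying on it.

```sql
-- UNION inside EXISTS: only the existence of a row matters,
-- so removing duplicates would not change the answer
IF EXISTS (SELECT Id FROM UnionTable1
           UNION
           SELECT Id FROM UnionTable2)
    PRINT 'At least one row came back';

-- UNION with an ORDER BY on the output: the result has to be sorted anyway
SELECT Id FROM UnionTable1
UNION
SELECT Id FROM UnionTable2
ORDER BY Id;
```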

Like I said, this is all fairly simple and straightforward, but you would be surprised how often people don’t think about it.
