You might be wondering why I’m going into such a simple subject. Well, the way I see it, there are four options here.
- You already know the difference, it seems really obvious and you are probably wondering why I’m mentioning it.
- You think you know the difference but it turns out you are wrong (don’t worry, it happens).
- You don’t know the difference and once I’ve pointed it out you will wonder why on earth you never thought of it before.
- You don’t care.
If you don’t care there isn’t much help I can give you. If you already know what I have to say then you don’t need help. That leaves a 50% chance that you will find this interesting. So here goes.
At its simplest, the difference is that UNION returns a distinct list of rows and UNION ALL returns all rows, duplicates included.
    --Table setup
    CREATE TABLE UnionTable1 (Id Int)
    CREATE TABLE UnionTable2 (Id Int)
    INSERT INTO UnionTable1 VALUES (2), (4), (6), (8), (10), (12)
    INSERT INTO UnionTable2 VALUES (3), (6), (9), (12)

    --Union example
    SELECT Id AS [UNION] FROM UnionTable1
    UNION
    SELECT Id FROM UnionTable2

    SELECT Id AS [UNION ALL] FROM UnionTable1
    UNION ALL
    SELECT Id FROM UnionTable2
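If you want a quick sanity check without eyeballing the result sets, counting the rows works too. This is just a sketch against the two tables above; the derived table aliases (u and ua) and the column aliases are names I made up for illustration.

```sql
-- Count the rows each operator returns for the sample tables.
-- 6 and 12 appear in both tables, so UNION drops two duplicates.
SELECT COUNT(*) AS UnionRows      -- 8
FROM (
    SELECT Id FROM UnionTable1
    UNION
    SELECT Id FROM UnionTable2
) AS u;

SELECT COUNT(*) AS UnionAllRows   -- 10
FROM (
    SELECT Id FROM UnionTable1
    UNION ALL
    SELECT Id FROM UnionTable2
) AS ua;
```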
Where things get a little bit interesting is how UNION handles generating that distinct list. You will notice that the UNION output comes back in order while the UNION ALL output does not. To generate the distinct list, UNION sorts the values here, which means an additional sort operator in the execution plan. Keep in mind that the ordering is a side effect of that sort, not a guarantee; if you actually need the rows in order, add an ORDER BY.
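One way to picture what UNION is doing: logically it gives you the same rows as a SELECT DISTINCT over the UNION ALL. Here is a rough sketch using the tables above (the alias AllRows is just a name picked for illustration), along with the explicit ORDER BY version for when the order matters.

```sql
-- Logically equivalent to UNION: combine everything, then remove duplicates.
SELECT DISTINCT Id
FROM (
    SELECT Id FROM UnionTable1
    UNION ALL
    SELECT Id FROM UnionTable2
) AS AllRows;

-- If the order matters, ask for it explicitly rather than relying on the sort.
SELECT Id FROM UnionTable1
UNION
SELECT Id FROM UnionTable2
ORDER BY Id;
```

The optimizer is free to pick a different plan for each form, so treat this as a way to reason about the results, not about the cost.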
For comparison here is the execution plan for UNION ALL.
Notice that the sort operator for the UNION is by far the most expensive part of the whole process.
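The plans are easiest to read graphically, but if you want to reproduce them as text, something like this should do it in SSMS or sqlcmd. It’s a sketch, assuming SQL Server and the tables from the setup above; GO is the batch separator those tools use, and while SHOWPLAN_TEXT is on the queries are compiled but not actually run.

```sql
SET SHOWPLAN_TEXT ON;
GO

-- Per the plan described above, expect a sort step for the duplicate removal.
SELECT Id FROM UnionTable1
UNION
SELECT Id FROM UnionTable2;

-- Nothing to de-duplicate, so there should be no sort here.
SELECT Id FROM UnionTable1
UNION ALL
SELECT Id FROM UnionTable2;
GO

SET SHOWPLAN_TEXT OFF;
GO
```

On other data or with indexes in place the optimizer may choose a different operator to remove duplicates, so the exact plan you see can vary.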
So what does that mean for you? Unless you actually need UNION (i.e., you need to get rid of duplicates), you want to use UNION ALL, as it’s the much cheaper and faster option.
There are a couple of exceptions. If you are doing a UNION in an EXISTS clause, then the optimizer knows enough that it doesn’t bother with the sort, and the execution times are the same. Also, if you are already sorting the output (using an ORDER BY), then most of the cost is already taken care of.
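For the EXISTS case, here is a small sketch against the same sample tables (the PRINT messages are just placeholders). Both checks only ask whether at least one row comes back, so removing duplicates can’t change the answer.

```sql
-- EXISTS only needs one row, so de-duplicating adds nothing here.
IF EXISTS (
    SELECT Id FROM UnionTable1
    UNION
    SELECT Id FROM UnionTable2
)
    PRINT 'UNION: found at least one row';

IF EXISTS (
    SELECT Id FROM UnionTable1
    UNION ALL
    SELECT Id FROM UnionTable2
)
    PRINT 'UNION ALL: found at least one row';
```

If you want to confirm the two behave the same on your own system, compare their execution plans rather than taking my word for it.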
Like I said, this is all fairly simple and straightforward, but you would be surprised how often people don’t think about it.