Log in  ::  Register  ::  Not logged in

## UNION and Duplicate values

 Author Message t.franz SSCrazy Group: General Forum Members Points: 2454 Visits: 398 Comments posted to this topic are about the item UNION and Duplicate values Stewart "Arturius" Campbell SSC Guru Group: General Forum Members Points: 54627 Visits: 7914 Nice, easy one to start the day off on, thanks, FransUsing a DISTINCT with a UNION add about as much value as using it in a subquery (especially a correlated subquery)e.g. `SELECT Col1, Col2 FROM dbo.TheTable WHERE Col3 IN (SELECT Col1, Col2 FROM dbo.TheTable WHERE Col3 IN (SELECT DISTINCT Col2 FROM dbo.Table2 WHERE ...) Col2 FROM dbo.Table2 WHERE ...)` ____________________________________________Space, the final frontier? not any more...All limits henceforth are self-imposed.“libera tute vulgaris ex” Budd SSCrazy Group: General Forum Members Points: 2075 Visits: 747 Nice easy Question, but again it makes me think...Which of these the the better performs better, and what factors determine that?SELECT DISTINCT * FROM (VALUES (1), (2), (3), (1)) AS t1(id)UNION ALLSELECT DISTINCT * FROM (VALUES (4), (5)) AS t2(iOR THISSELECT * FROM (VALUES (1), (2), (3), (1)) AS t1(id)UNIONSELECT * FROM (VALUES (4), (5)) AS t2(id)d) Budd SSCrazy Group: General Forum Members Points: 2075 Visits: 747 +xBudd - Tuesday, June 12, 2018 7:30 AMNice easy Question, but again it makes me think...Which of these performs better, and what factors determine that?SELECT DISTINCT * FROM (VALUES (1), (2), (3), (1)) AS t1(id)UNION ALLSELECT DISTINCT * FROM (VALUES (4), (5)) AS t2(iOR THISSELECT * FROM (VALUES (1), (2), (3), (1)) AS t1(id)UNIONSELECT * FROM (VALUES (4), (5)) AS t2(id)d)sorry about the TYPO's t.franz SSCrazy Group: General Forum Members Points: 2454 Visits: 398 it depends :-), but both queries could return different results when are used with real tables.Your first query (with UNION ALL) could return duplicates, when you have the same ids in the first and second subquery (e.g. when you add a (2) into the value list after the (5)). The second query (with UNION) would eliminate this duplicates too.So it is not valid to compare the performance, since you are comparing different (but similar) things.PS: if you are SURE, that there are only different IDs in both subqueries, the UNION ALL query would be slightly faster, since it is easier to eliminate duplicates in a e.g. 10 and another 20 items list, than in a 30 items list (in a 10 item list, you have to compare the first with the second until the 10th, the second value with 3rd until 10th and so on, so you have 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 = 45 possible combinations; in a 20 items list there are 209 combinations and in a 30 items list 464) Budd SSCrazy Group: General Forum Members Points: 2075 Visits: 747 +xt.franz - Tuesday, June 12, 2018 7:42 AMit depends :-), but both queries could return different results when are used with real tables.Your first query (with UNION ALL) could return duplicates, when you have the same ids in the first and second subquery (e.g. when you add a (2) into the value list after the (5)). The second query (with UNION) would eliminate this duplicates too.So it is not valid to compare the performance, since you are comparing different (but similar) things.DUH, You are so correct. and now I see it. I should have thought that through a little better before posting. Time for more coffee.. :-) JustMarie SSCertifiable Group: General Forum Members Points: 6212 Visits: 1475 Thanks for the deceptively simple question. I know I'll use this reminder going forward. Until I forget again..