So, now knowing how a query is processed internally, how can we use this knowledge to improve query performance or perhaps troubleshooting unexpected results?
I think there are two standout things to learn from this:
1. Since JOIN happens before WHERE a more complete ON clause may be more efficient as there will be fewer rows to filter out in the WHERE clause.
2. Since GROUP BY happens before SELECT and DISTINCT happens after SELECT. These:
a) SELECT DISTINCT A,B,C FROM TABLE
b) SELECT A,B,C FROM TABLE GROUP BY A,B,C
Are not necessarily going to have the same efficiency. This surprises me a little as my naive assumption would be that the optimizer would convert the DISTINCT into a GROUP BY.