1) I don't think it's necessary to scope the column in the subquery. Actually you can only select the columns from the tables in the FROM subclause in the sub-query.
That is not true. You can also reference columns from the outer query. That is called a correlated subquery, and many people use them on a daily basis.
It does not make sense to select columns from the tables refereneced by the main query. What result set are you expecting from a query like:
SELECT * FROM sys.server_principals b WHERE b.sid=(SELECT b.[sid] FROM sys.databases a WHERE a.name=DB_NAME())
Exactly one row in sys.databases will match the WHERE clause. For this row, b.[sid] will be taken from the outer query and used as the result of the subquey. This is of course always equal to b.sid, so the WHERE clause is True for each and every row in sys.server_principals. The result set I expect from this query would be every single row in sys.server_principals.
2) Since I cannot reach SQL Server, can anyone run the following and let me what you get:
CREATE TABLE Ta (Ca1 int not null, Ca2 int not null)
CREATE TABLE Tb (Cb1 int not null, Cb2 int not null)
INSERT Ta (Ca1,Ca2) VALUES (1,1)
INSERT Ta (Ca1,Ca2) VALUES (2,2)
INSERT Tb (Cb1,Cb2) VALUES (2,2)
INSERT Tb (Cb1,Cb2) VALUES (3,3)
If we make a mistake in the following query
SELECT * FROM Ta WHERE Ca1 IN (SELECT Cb1 FROM Tb WHERE Cb2=2)
and write it as:
SELECT * FROM Ta WHERE Ca1 IN (SELECT Ca1 FROM Tb WHERE Cb2=2)
What result you get? Does SQL Server report error?
The first query returns this (as expected):
The second query returns this, as expected by me (but probably not by you):
3) As I said in the article, it's talking about a scenario when a developer makes a mistake and SQL Server does not report the problem. If SQL server impliments the sub-query as a join (as mentioned by Hugo), it's an even bigger issue. The sub-query returns a scalar value in the example. It has nothing to do with the main query.
SQL Server doesn't necessarily implement the subquery as a join (though that is, in fact, exactly what the optimizer will do in many cases - but it's irrelevant here). It assumes that you wanted to write a correlated subquery, since you are referencing columns from the outer query in the subquery. I'd love there to be a provision that references to outer queries always require explicit prefixing of table name or table alias, but such a provision is unfortunately not in any of the SQL standards that I know of (SQL-92, SQL-1999, and SQL-2003), and is not implemented by Microsoft either.
In fact, I'd put it one step further and say that I'd prefer it to be required to prefix each column in every query that uses more than a single table. That would prevent lots of those pesky, hard to find bugs caused by typos.
Thanks for taking the time to respond. Enjoy the rest of your holiday, I'll still be here, ready to read your reply, when you get back.
Hugo Kornelis, SQL Server MVP
Visit my SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis