February 14, 2020 at 5:51 am
I have a situation where SQL Server should throw an error but, instead, returns an incorrect result. I simplified it down to this:
CREATE TABLE #PROVINCES (ProvinceCode CHAR(2), ProvinceName VARCHAR(200))
CREATE TABLE #CITIES (ProvinceCode CHAR(2), CityName VARCHAR(100) )
INSERT INTO #PROVINCES VALUES ('NY','New York')
INSERT INTO #PROVINCES VALUES ('FL','Florida')
INSERT INTO #PROVINCES VALUES ('BC','British Columbia')
INSERT INTO #PROVINCES VALUES ('ON','Ontario')
INSERT INTO #CITIES VALUES ('FL','Miami')
INSERT INTO #CITIES VALUES ('NY','New York')
INSERT INTO #CITIES VALUES ('BC','Manitoba')
INSERT INTO #CITIES VALUES ('ON','Toronto')
SELECT * FROM #CITIES WHERE ProvinceCode IN (SELECT ProvinceCode FROM #PROVINCES WHERE ProvinceName = 'Florida')
SELECT * FROM #CITIES WHERE ProvinceCode IN (SELECT ProvinceCode FROM #PROVINCES WHERE CityName = 'Toronto')
The first SELECT returns "Miami", as it should. The second SELECT does NOT throw an error, even though "CityName" is not a column in the #PROVINCES table. Instead, it incorrectly returns "Toronto". This only happens when the invalid column name in the subselect is a valid column name in the outer table.
Does anyone understand what is going on here? This is insidious. A syntax error in the query causes the query to return an incorrect result, instead of throwing an error. There is no way to detect this syntax error unless you rigorously cross-check your query results and find something that does not make any sense.
February 14, 2020 at 6:19 am
That is a simple question to answer. The reason it doesn't throw an error in the second query is because of how SQL Server works with subqueries. Since you are not using table aliases in your queries, since SQL Server cannot find the column name specified in the subquery it looks to the outside query for the column name which it finds.
This is one of the reasons you should always use table aliases. You would catch the error if your code looked like this:
-- This query will work
SELECT
c.*
FROM
#CITIES AS c
WHERE
c.ProvinceCode IN (SELECT
p.ProvinceCode
FROM
#PROVINCES AS p
WHERE
p.ProvinceName = 'Florida');
-- The following query will error
SELECT
c.*
FROM
#CITIES AS c
WHERE
c.ProvinceCode IN (SELECT
p.ProvinceCode
FROM
#PROVINCES p
WHERE
p.CityName = 'Toronto');
February 14, 2020 at 8:49 am
I did not know this... are there any ms documents around it?
MVDBA
February 14, 2020 at 11:50 am
Your query is equivalent to this:
SELECT *
FROM #CITIES
WHERE ProvinceCode IN (SELECT ProvinceCode
FROM #PROVINCES
WHERE #CITIES.CityName = 'Toronto')
February 14, 2020 at 1:17 pm
I did not know this... are there any ms documents around it?
Not that I've seen. The observed behaviour is necessary to support correlated subqueries, where columns from both inside and outside of the subquery have to be visible. As Lynn has pointed out, you only really fall foul of the OP's example if you fail to use table aliases.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
February 14, 2020 at 1:22 pm
In situations like this SQL Server will use the most local column it can find if there are no aliases. So if the column name is present on both #PROVINCES and #CITIES it would pick the column on #PROVINCES, but if it doesn't exist on #PROVINCES it will use the one in #CITIES. I can't find the documentation that states this but I have definitely seen it documented. Good advice by everyone else to always alias columns.
February 14, 2020 at 3:34 pm
In situations like this SQL Server will use the most local column it can find if there are no aliases. So if the column name is present on both #PROVINCES and #CITIES it would pick the column on #PROVINCES, but if it doesn't exist on #PROVINCES it will use the one in #CITIES. I can't find the documentation that states this but I have definitely seen it documented. Good advice by everyone else to always alias columns.
I knew I read it too and had to go find it. It's in the Qualifying column names in subqueries section in the Subqueries documentation:
Sue
February 14, 2020 at 3:41 pm
If you had written the query like this the error would have appeared.
SELECT *
FROM #CITIES C
INNER JOIN #PROVINCES P ON C.ProvinceCode - P.ProvinceCode
AND C.CityName = 'Toronto'
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
Viewing 8 posts - 1 through 7 (of 7 total)
You must be logged in to reply to this topic. Login to reply
This website stores cookies on your computer.
These cookies are used to improve your website experience and provide more personalized services to you, both on this website and through other media.
To find out more about the cookies we use, see our Privacy Policy