T-SQL correlated queries

  • Comments posted to this topic are about the item T-SQL correlated queries

  • I had to cheat to get this and actually run the code. I got suspicious when I couldn't see the answer I was expecting "Invalid column name 'c1'."
    Personally I think that's a bug, because the inner select returns no rows and should therefore be equivalent to 

    select c1
    from dbo.t1
    where c1 in ()

    which returns a syntax error (but I can also understand an empty set as there's nothing inside the IN statement).

  • Toby Harman - Wednesday, March 13, 2019 10:40 PM

    I had to cheat to get this and actually run the code. I got suspicious when I couldn't see the answer I was expecting "Invalid column name 'c1'."
    Personally I think that's a bug, because the inner select returns no rows and should therefore be equivalent to 

    select c1
    from dbo.t1
    where c1 in ()

    which returns a syntax error (but I can also understand an empty set as there's nothing inside the IN statement).

    Nice question, thanks Evgeny.

    Tony, this is not a bug, but a side effect of correlated subqueries.
    all columns from the main SELECT are available for use in the subselect, so it will compile.
    per this extract from Microsoft docs:

    Important

    If a column is referenced in a subquery that does not exist in the table referenced by the subquery's FROM clause, but exists in a table referenced by the outer query's FROM clause, the query executes without error. SQL Server implicitly qualifies the column in the subquery with the table name in the outer query.

    ____________________________________________
    Space, the final frontier? not any more...
    All limits henceforth are self-imposed.
    “libera tute vulgaris ex”

  • Thanks Stuart
    So the subquery becomes (SELECT t1.c1 from t2) which makes even less sense to me!
    I appreciate that t1 is in scope and therefore can be parsed and used, and that MS have documented this behaviour, but I doubt many would find it desirable!

  • Toby Harman - Thursday, March 14, 2019 12:03 AM

    Thanks Stuart
    So the subquery becomes (SELECT t1.c1 from t2) which makes even less sense to me!
    I appreciate that t1 is in scope and therefore can be parsed and used, and that MS have documented this behaviour, but I doubt many would find it desirable!

    Toby it might make more sense if you were doing something like this:
    (SELECT t1.c1 * t2.c1 FROM t2)

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Nice example to demonstrate that you should name your tables and columns well, to avoid confusion.

  • Another reason why you should always prefix the column names with the table name (or alias) .

  • Jonathan AC Roberts - Thursday, March 14, 2019 5:48 AM

    Another reason why you should always prefix the column names with the table name (or alias) .

    Indeed!
    select t1.c1
    from dbo.t1
    where t1.c1 in (select t2.c1 from t2)

    That would give the expected error.
    I'll admit: if I had seen the error I expected in the options, I might not have looked at the code closely enough to get it right.


    Puto me cogitare, ergo puto me esse.
    I think that I think, therefore I think that I am.

  • Thanks for this instructive question!

    - webrunner

    -------------------
    A SQL query walks into a bar and sees two tables. He walks up to them and asks, "Can I join you?"
    Ref.: http://tkyte.blogspot.com/2009/02/sql-joke.html

  • Jonathan AC Roberts - Thursday, March 14, 2019 5:48 AM

    Another reason why you should always prefix the column names with the table name (or alias) .

    Definitely. Similar to why I always put brackets around logic statements to make it very clear the order of precedence otherwise one small change elsewhere can break an entire query...

  • Stewart "Arturius" Campbell - Wednesday, March 13, 2019 11:44 PM

    Nice question, thanks Evgeny.

    Tony, this is not a bug, but a side effect of correlated subqueries.
    all columns from the main SELECT are available for use in the subselect, so it will compile.
    per this extract from Microsoft docs:

    Important

    If a column is referenced in a subquery that does not exist in the table referenced by the subquery's FROM clause, but exists in a table referenced by the outer query's FROM clause, the query executes without error. SQL Server implicitly qualifies the column in the subquery with the table name in the outer query.

    Thanks Stewart for clear explanation and reference to Microsoft docs.
    The explanation given in QotD is not accurate.

  • Toby Harman - Thursday, March 14, 2019 12:03 AM

    Thanks Stuart
    So the subquery becomes (SELECT t1.c1 from t2) which makes even less sense to me!
    I appreciate that t1 is in scope and therefore can be parsed and used, and that MS have documented this behaviour, but I doubt many would find it desirable!

    Many of us find it desirable when we're actually designing correlated subqueries. It certainly makes no sense to use a column from the outer query in the column list (unless it is in a complex calculation). But it makes sense when using them on other sections on the query. e.g. the alternative to the correct query.

    SELECT c1
    FROM dbo.t1
    WHERE EXISTS ( SELECT * FROM dbo.t2
                   WHERE t1.c1 = t2.c2);

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • This is a "Works As Designed" feature that makes me wonder just what the logic was for that design.

    40 years of experience shows a pattern to this; there is a warning in the documentation which is a hint that it's just how it is, and by documenting it any weird behaviour becomes WAD instead of a bug.

    Usually this kind of thing comes up during Beta testing, and we push the manufacturer to document it because it's so damn weird that no-one would expect it to happen.

  • paul 69344 wrote:

    This is a "Works As Designed" feature that makes me wonder just what the logic was for that design. 40 years of experience shows a pattern to this; there is a warning in the documentation which is a hint that it's just how it is, and by documenting it any weird behaviour becomes WAD instead of a bug. Usually this kind of thing comes up during Beta testing, and we push the manufacturer to document it because it's so damn weird that no-one would expect it to happen.

     

    Which aspect of correlated subquery activity are you criticizing here Paul?

    It always seemed perfectly sensible to me, apart from an omission - when a column is available from either the inner or the outer query, SQL Server should require aliasing and error with "Ambigous columnname" where appropriate. This alone would eliminate most of the disputes over correlated subquery peccadilloes.

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply