TSQL Recomile Problems when using subqueries

  • Good Day All

    I have found a problem with a Simple TSQL script that is returning incorrect results due to a schema change that didn't cause the old plan to recompile.

    I will just post how to duplicate the problem and hopefully people can tell me if it is intended to function this way.

    You have an Instance of SQL, on that instance you have 2 Databases, TestDB1 and TestDB2.

    You have a table in each of these databases that are exactly the same (At the start).

    You have a query to select an ID that is in TestDB1 where the values exist in [TestDB2].

    eg. Select AccountID from TestDB1.dbo.Table1 where AccountID in (Select AccountID from TestDB2.dbo.Table2)

    The above query runs and the expected results are returned.. all is well

    Then someone changes the TestDB2.dbo.Table2 Column name changes from AccountID to AccountIDArchive (No idea why it changed but it does)

    At this point i would assume the schema change should invalidate the plan so that if you run the original query with the AccountID being specified in the subquery it would result in a error.. column cannot be found.

    If you run the subquery on its own, it does return an error but if you run the Original query

    Select AccountID from TestDB1.dbo.Table1 where AccountID in (Select AccountID from TestDB2.dbo.Table2)

    It does not give you an error, and runs successfully

    Obviously the Origional queries subquery had to be changed to AccountIDArchive to accommodate for the change but i was surprised to see it still using the old plan and that an invalid TSQL code could execute "successfully", or generating a working plan.

    Using option recompile with the origional query does not cause the query to error out but if you do change the subquery to a new invalid column name it does error out

    Any opinions to prevent this or explanation as to why this is happening would help 🙂

    Regards

  • It's got nothing to do with recompiles, the schema change did invalidate the old plan (schema changes always invalidate all plans relating to that table) and SQL did compile a new one.

    When dealing with a subquery, any column of any table inside or outside the subquery is valid, legal and in scope inside the subquery. Since there's no AccountID in the table in the subquery, the column correctly binds to the table in the outer query. Not what you intended, but correct as far as SQL Server is concerned.

    Your original query:

    Select AccountID

    from TestDB1.dbo.Table1

    where AccountID in (Select AccountID from TestDB2.dbo.Table2)

    Since there's no AccountID in Table2, SQL checks to see if it's a column in Table1. It is, so your query is actually bound and executed as

    Select t1.AccountID

    from TestDB1.dbo.Table1 t1

    where t1.AccountID in (Select t1.AccountID from TestDB2.dbo.Table2 t2)

    Perfectly legal, perfectly valid, just probably not what you intended (this is called a correlated subquery, a subquery that references one or more columns from a table in the outer query.)

    This is why you should always, always, always qualify column names, especially when you have subqueries.

    This will throw the expected error.

    Select t1.AccountID

    from TestDB1.dbo.Table1 t1

    where t1.AccountID in (Select t2.AccountID from TestDB2.dbo.Table2 t2)

    Always, always, always qualify your column names.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks a mill Gail

    That does make perfect sense.

    Regards

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply