Puzzled by T-SQL

Live blogging this a bit as I try things. This will update a bit, so you’ll have to read through.

Adam Machanic posted this: T-SQL Puzzle-How many rows will this return? SELECT*FROM(VALUES(1),(2))AS x(i)WHERE EXISTS(SELECT MAX(i)FROM(VALUES(1))AS y(i)WHERE y.i=x.i)

I was in a doctor’s office waiting at the time, but I responded that I didn’t think one row was right. I didn’t have the chance to see what happened, so I couldn’t reason through what was happening. Maybe I should have been able to? Not sure.

I got home and ran this (thanks, SQL Prompt):

SELECT
     *
   FROM     ( VALUES ( 1), ( 2) ) AS x ( i )
   WHERE     EXISTS ( SELECT  MAX(y.i)
                    FROM
                    ( VALUES ( 1) ) AS y ( i )
               WHERE
                 y.i = x.i );

I get two rows back, a 1 and a 2. Very strange.

I tried experimenting a bit. I created tables and put data in there. Maybe there’s something I don’t get in the VALUES() clause.

CREATE TABLE mytable99 (id INT);
CREATE TABLE mytable999 (id INT);
GO
INSERT dbo.mytable99
( id )
VALUES
( 1 ), (2);
INSERT dbo.mytable999
( id )
VALUES
( 1 )
;
GO
SELECT
*
FROM
dbo.mytable99 AS x
WHERE
EXISTS ( SELECT MAX(y.id)
        FROM dbo.mytable999 AS y
        WHERE y.id = x.id );

Same result.

Hmmm, Adam added a clue. Why does select max(1) work?

I was guessing that max() operates on the scalar set of [1]. However I’m not sure.

I then did this:

UPDATE dbo.mytable999 SET id = 9;

When I re-ran the query, still two rows. Without a match.

Next I added another row to the first table.

INSERT dbo.mytable99
( id)
VALUES
( 3 );

Now I get three rows.

What I’m guessing (at this point) is that the correlated subquery returns a 1 for every row of the first table, so this means I get the size and shape of that table. The EXISTS() is always satisfied.

I’ll be interested to learn what is happening, or if I’m right.

Update: I’m not.

Or semi-right.

The exists is satisfied, but why?

Adam posted a second hint asking me to remove the max(). I did that and got 1 row. Well that’s interesting. How does the aggregate affect the correlated subquery, and how does this affect the Exists().

I decided to break down the query inside the EXISTS(). I did this with scalar values. Since I’ve been dealing with two rows, I used those two scalar values.

 SELECT
                y.i --MAX(y.i)
              FROM
                ( VALUES ( 1) ) AS y ( i ) 
  WHERE
                y.i = 1 ;
 SELECT
                y.i --MAX(y.i)
              FROM
                ( VALUES ( 1) ) AS y ( i ) 
  WHERE
                y.i = 2;

With these queries, I get one row and zero rows, just an empty set. That makes sense in terms of why removing the MAX() gives me one row in the whole thing.

Next I added the MAX back.

 SELECT
                MAX(y.i)
              FROM
                ( VALUES ( 1) ) AS y ( i ) 
  WHERE
                y.i = 1 ;
 SELECT
                MAX(y.i)
              FROM
                ( VALUES ( 1) ) AS y ( i ) 
  WHERE
                y.i = 2;

Now I get one row for the first, and one row for the second? Huh? However the second set is a row with NULL in it. I checked the EXISTS() documentation, and sure enough, if there are rows, this returns true, even if the row has a null value. This isn’t the value of a row, but rather just its presence.

I then did this to check:

SELECT 'test' = 1
 WHERE EXISTS( SELECT * FROM mytable WHERE 1 = 0);
SELECT 'test' = 1
 WHERE EXISTS( SELECT null);

Sure enough, the first gives me an empty result set, while the second doesn’t.

But why does MAX() return a row? I tried this with a simple query:

 SELECT MAX(i)
  FROM ( VALUES (1)) AS x(i)
  WHERE x.i = 2

Which does return NULL. I did search and saw this explanation on SO, saying that the result of the MAX() for the group (where x.i=2) is undefined, hence the NULL. This is born out as you see here:

CREATE TABLE mytable88(i INT);
GO
SELECT MAX(i) FROM dbo.mytable88 AS m;

Strange. Certainly I wouldn’t have expected that from MAX(). I would have thought it was an empty set, but apparently that’s not the case.

Filed under: Blog Tagged: syndicated, T-SQL

Rate

Share

Share

Rate

Puzzled by T-SQL

Rate

Share

Share

Rate

Related content

Book Review: Big Red - Voyage of a Trident Submarine

Database Mirroring FAQ: Can a 2008 SQL instance be used as the witness for a 2005 database mirroring setup?

Inserting Markup into a String with SQL

Networking - Part 4

Speaking at Community Events - More Thoughts