Outer Join Mystery

Question

Post reply

Outer Join Mystery

Viewing 15 posts - 16 through 30 (of 39 total)

You must be logged in to reply to this topic. Login to reply

Scott Strickland SSC Rookie Points: 43 More actions · Answer 1

Scott Strickland

SSC Rookie

Points: 43

May 15, 2006 at 12:25 pm

#637959

So, how does Oracle return the "right" results?

Scott Coleman One Orange Chip Points: 27452 More actions · Answer 2

The use of the old-style syntax promoted the entire WHERE clause to the ON clause. You get the same results as the "wrong" query if you used "LEFT JOIN ... ON S. = O. AND o.rats_filename IS NOT NULL". The IS NOT NULL condition is being used as a a join condition instead of a filter condition and, since none of the fields are null, has no effect.

The only mystery to me is how Oracle differentiates join conditions and filter conditions. Maybe only logical comparisons involving both tables are considered join conditions?

Grant Fritchey SSC Guru Points: 398690 More actions · Answer 3

It simply returns the anticipated results, which is only those rows that don't match. Precisely how... you need to ask someone that understands Oracle. I think Scott Coleman's answers have been more precise than anything I'm going to offer up.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Grant Fritchey SSC Guru Points: 398690 More actions · Answer 4

Thanks. That makes more sense than any other explanation I've been able to come up with.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Jeff B. SSC Veteran Points: 253 More actions · Answer 5

Itzik Ben-Gan wrote an article that explains this very clearly in SQL Magazine in the October 2004 issue. The InstantDoc ID is #43681.

The problem is that in Old Style Joins, you can't separate the join clause and the Where Clause so the result may include extraneous NULL values in your result set.

Here are two simple queries you can run in PUBS to see this:

-- Old Style Join

-- 13 Valid Entries, 6 Invalid Entries

select o.name, i.name,i.indid

from sysobjects o, sysindexes i

where o.id *= i.id and

o.type = 'S' and

i.indid = 1

-- New Style Join

-- 13 Valid Entries, 0 Invalid Entries

select o.name, i.name,i.indid

from sysobjects o

left outer join sysindexes i

on o.id = i.id

where o.type = 'S' and

i.indid = 1

Skip Hansen SSC Rookie Points: 49 More actions · Answer 6

Interesting article. I'm glad I always use the newer syntax. To the poster who said the problem was that the problem was not knowing which part of the where to apply, I would have to disagree. It is something more than that, although I don't know what. For example, I tried the following:

SELECT testnull =

CASE

WHEN tmp.Rats_filename IS NULL THEN 'NULL '

WHEN tmp.Rats_filename IS NOT NULL THEN 'NOT NULL'

ELSE 'OTHER'

END

, *

FROM

(SELECT *

FROM UUG_APP_DVLP.dbo.[IIAFeedTransaction] s

, UUG_APP_DVLP.dbo.[RATSIIAFeedTransaction] o

WHERE S.IIATransactionId *= substring(rats_filename,1+

patindex ('%{________-____-____-____-

____________}%',rats_filename),36)

)TMP

WHERE tmp.Rats_filename IS NOT NULL

ORDER BY 2

For the above, there should be no problem with what to join together. The second WHERE should work properly, and I would expect it to work the same as the CASE statement. But it doesn't!!!. See results below:

Testnull

IIATransactionId

RATS_FILENAME

DATE_ENTERED

NULL

9B33A776-408B-4928-AE2A-0FF43995DE12

1

NULL

8EEA51CE-D87A-4F76-B9C1-7AD11532D444

0

NULL

5D31FD56-7C0B-408B-83A5-864B7BD35ADC

1

NULL

NOT NULL

365A0FD8-5042-4297-A082-8F5B11450AF4

1

CISF-{260021}-{365A0FD8-5042-4297-A082-8F5B11450AF4}-{7A050246-59F3-4E72-BFD2-40EB183B7D3C}

2006-05-03 13:46:21.470

NOT NULL

16706611-C94D-4FBC-8F4E-9077C3B9E697

1

CISF-{260346}-{16706611-C94D-4FBC-8F4E-9077C3B9E697}-{73916523-EC71-4989-BE99-1EDB507D67DF}

2006-05-03 13:46:21.470

NOT NULL

55DD6703-9693-45E2-A339-987066EA2864

2

RA-1-{259590}-{55DD6703-9693-45E2-A339-987066EA2864}-{EF41CFCE-20BA-4D2D-A4C8-121BFC7DAEE3}

2006-05-03 13:46:21.470

NULL

A721BD38-9C76-4C3A-A7C5-9CC0227F90F4

0

NULL

NOT NULL

F4849137-6454-46E5-9811-A6985A189249

1

RA-2-{219960}-{F4849137-6454-46E5-9811-A6985A189249}-{F7267BD6-62CB-4379-8533-C413DD5EA402}

2006-05-03 13:46:21.470

Moral of this story is: Stay with the new syntax.

Scott Coleman One Orange Chip Points: 27452 More actions · Answer 7

You might think "there should be no problem with what to join together" because you put parentheses around the derived table, but the query optimizer has its own agenda sometimes. What did the execution plan look like? You have got to give the query compiler credit, it's not easy to fool it with an extra level of parenthesis.

A working hypothesis might be that in SQL Server once you use an old=style join then all WHERE clauses that refer to tables A or B (either singly or together) are interpreted as join conditions. I don't know this to be true, but it explains the examples you've presented. The difference in Oracle may be that it only uses conditions on both tables for joining, while conditions on only one table are used for filtering.

S P-328547 Newbie Points: 1 More actions · Answer 8

SELECT

*

FROM

(

SELECT

'Naughty Optimizer; Real developers use parenthesis to clarify default behavior AND SPECIFY REQUIRED BEHAVIOR.' AS 'My two cents',

'Using SELECT ColTitle = Expr is nice for swapping UPDATE for SELECT when Expr is a col name in UPDATE target, but this is legacy syntax' AS 'Farewell Old Friend'

WHERE

1 = 1

&nbsp AS Moral_of_Story__Use_The_New_Syntax

Ian Yates SSCoach Points: 19738 More actions · Answer 9

I've always used the newer syntax as it made things obvious and clear to me... I suppose if I had grown up using the old syntax I might think differently - but I also shudder to think of the difficulties I would've found trying to express what I needed in a query using the old syntax when the new syntax is just so easy and straightforward to both write, debug and explain to others. Yay for new syntax - or, as far as I was concerned, the only syntax anyway

Kenneth Wilhelmsson SSC-Dedicated Points: 30043 More actions · Answer 10

As with most things, it's a matter of habit.

Things we know are 'easy', things we don't know are 'difficult'.

There's no rocketscience about that.

I started out writing T-SQL the old way, and switched to ANSI style sometime around y2k.

At that time ANSI was hard, difficult, weird, strange, awkward... take your pick.

Of course that was because in the beginning it was something that I didn't knew too well.

But with practice comes perfect, and once you get used to it there are no regrets.

ANSI is way more clear, concise and easy to write and moreover, to read. There's no doubt when

you read a query what the author intended with it when it was written (assuming it was written correctly ofc)

The major point is that there's no ambiguity for outer joins - a quite tricky area to 'get right'.

I should add as a final note a plug for aliasing. Always alias everything. ANSI outer joins with aliases on all columns and tables is pretty much self-documenting code.

/Kenneth

Grant Fritchey SSC Guru Points: 398690 More actions · Answer 11

Doggone it. I was looking through Itzik's stuff for answer. I guess I didn't dig deep enough. Thanks for posting that.

Itzik's one of the minor dieties around my shop ever since we spent a week with him doing advanced TSQL training.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Grant Fritchey SSC Guru Points: 398690 More actions · Answer 12

That's how I "solved" the problem when it was first presented to me. I couldn't read it in a way that made sense with the old syntax so I rewrote it and it was suddenly working.

My Oracle-centric coworker held a gun to my head until I tried it several times both ways. It was just a hoot watching the data changing based on syntax only.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Grant Fritchey SSC Guru Points: 398690 More actions · Answer 13

Amen on the aliasing brother.

We've been reviewing some code from developers that not only didn't use aliasing, but used table names instead. Unfortunately they didn't even do that consistently so we end up with

SELECT table.column, dbo.table.column, column

FROM...

We're getting out the hickory stick for the next meeting with this dev team...

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Vinay Singhania Newbie Points: 1 More actions · Answer 14

Hi,

I had the same issue couple of days back and I solved it by same way (removing and clause and putting where clause).

VS

Dennis D. Allen Say Hey Kid Points: 699 More actions · Answer 15

There was a post that hit the nail on the head. Your where clause that stipulates that RAT_FILENAME IS NOT NULL does not limit results due to there being no NULL RAT_FILENAME columns... You want to look for NULLs after the join is complete.

More to the point would be something like;

/* return all IIAFeedTransaction.IIATransactionId values that are

* not used in the RATSIIAFeedTransaction.RAT_FILENAME values

*/

SELECT

IIATransactionId

FROM

dbo.IIAFeedTransaction

WHERE

IIATransactionId IN (

SELECT SubString(RATS_FILENAME,1+PatIndex('%{________-____-____-____-____________}%',RATS_FILENAME),36)

FROM dbo.RATSIIAFeedTransaction

)