SQLServerCentral Article

ANSI Joins

,

Introduction

Are you confused by all this "inner join" style

syntax that is becoming more and more prominent lately?

It’s happening because Microsoft is moving towards using

ANSI syntax. The old syntax still works, but it is strongly recommended you

learn the new syntax, Microsoft have stated that they will not support the old

syntax indefinitely.

It’s fairly easy to translate from the old to the new

syntax, and correctly translated results will usually produce identical results,

but there are exceptions that you should be aware of.

Inner or natural joins

Inner joins return rows where data matching exists in the

tables you are joining. This is the simplest type of join, and moving between

the old and new syntax should present no problems. In general, this:

select o.name, i.name

from sysobjects o, sysindexes i

where o.id = i.id

Is equivalent to this:

select o.name, i.name

from sysobjects o inner join sysindexes i

on o.id = i.id

Note that the components of the where clause that

define the joins between tables are now defined in the from clause

instead. The where is now only used for specifying selection criteria. To

illustrate this:

select o.name, i.name

from sysobjects o, sysindexes i

where o.id = i.id and

indid = 1

Becomes:

select o.name, i.name

from sysobjects o inner join sysindexes i on o.id = i.id

where indid = 1

Incidentally, what the query above does is to display table

names and index names in the current database where the table has a Clustered

Index. The sysindexes.indid column is always set to one for clustered indexes.

I will use similar examples later on, and if you are not too

clear on these system tables, then now would be a good time for you to review

the documentation on sysobjects and sysindexes in Books Online.

Update and Delete

These statements also support the ANSI join syntax, although

it’s not very well documented in Books Online for some versions of SQL Server

update t1

set t1.col = …………from t1 inner join t2 on t1.col = t2.col

delete t1

from t1 inner join t2 on t1.col = t2.col

Outer joins

When two tables are joined with an inner join, data will only

be returned if matching data exists in both tables. An outer join is like saying

"and also include the rows from one table if there are no matching rows in

the other one."

With an outer join the columns from the table where data is

"missing" are returned as NULL values.

Outer joins come in two basic flavours, called Left and Right.

Left outer joins mean that the data must be contained in the table defined to

the left side of the equivalence, but not necessarily the right hand side. Right

outer joins, of course, work the other way around.

To illustrate this, cut and paste the code below into a Query

Analyser window and try running it. I have used the newer ANSI syntax here, and

the older equivalents are included but commented out using the "—"

comment notation. Comment them back in if you want to try them.

set nocount on

go

create table #left (leftI int)

create table #right (rightI int)

insert #left select 1 union select 2 union select 3

insert #right select 2 union select 3 union select 4 union select 5 union select 6

print "*** Inner join ***"

select * from #left inner join #right on leftI = rightI

-- select * from #left, #right where leftI = rightI

print " "

print "*** left join ***"

select * from #left left outer join #right on leftI = rightI

-- select * from #left, #right where leftI *= rightI

print " "

print "*** right join ***"

select * from #left right outer join #right on leftI = rightI

-- select * from #left, #right where leftI =* rightI

drop table #left

drop table #right

set nocount off

go

Outer Join Gotchas

Unlike natural joins, outer joins have a few traps you can

fall into if you translate your query from the old syntax to the new and expect

it to produce the same results in both formats.

Returning to our sysobjects and sysindexes examples, the

following two queries should produce the same results

select o.name, i.name

from sysobjects o, sysindexes i

where o.id *= i.id and

o.type = "S"

select o.name, i.name

from sysobjects o left outer join sysindexes i on o.id = i.id

where o.type = "S"

However, if we qualify the select criteria on sysindexes as

follows, in order to look for details on clustered indexes, the two queries

should return different results.

select o.name, i.name

from sysobjects o, sysindexes i

where o.id *= i.id and

o.type = "S" and

i.indid = 1

select o.name, i.name

from sysobjects o left outer join sysindexes i on o.id = i.id

where o.type = "S" and

i.indid = 1

The reason this happens is that the SQL Server reacts

differently to the "Where" clause when the outer joined table does not

have any data.

When no data is found in the sysindexes table that matches the

sysobjects table, indid will be returned as NULL. In this case:

  • With the old syntax, SQL Server will ignore the i.indid

    = 1 selection criteria and return a row in which indid IS NULL,

    even though, strictly speaking, the where i.indid =1 clause has

    failed

  • With the new syntax SQL Server takes the view that the test

    i.indid = 1 fails because i.inid is NULL, so no value is returned.

Full outer joins

Full outer joins effectively combine the left and right outer

joins so that data will be returned if it matches in both tables, or if it exist

in either one.

The old join syntax has no direct equivalent of the full outer

join.

Try out the SQL below to illustrate the full outer join.

set nocount on

go

create table #left (leftI int)

create table #right (rightI int)

insert #left select 1 union select 2 union select 3

insert #right select 2 union select 3 union select 4 union select 5 union select 6

print "*** FULL join ***"

select * from #left full outer join #right on leftI = rightI

drop table #left

drop table #right

set nocount off

go

Cross Joins

Cross Joins create a Cartesian Product, rather like when you forget to

include a "where" clause to join two tables. There are not many cases

where you would want to use a cross join, but this article

discusses one possible use for them. 

Orphan hunting

Prior to ANSI SQL, if you wanted to find records in one table

that did not match a record in another, the recommended solution was to use a

"NOT IN" or "NOT EXISTS" based query. It is now recommended

that you use outer joins instead. Here is a simple example using all three

possibilities – they should all produce the same results:

set nocount on

go

create table #left (leftI int)

create table #right (rightI int)

insert #left select 1 union select 2 union select 3

insert #right select 2 union select 3 union select 4 union select 5 union select 6

select *

from #right

where rightI not in (

    select leftI from #left

)

select *

from #right

where not exists (

    select * from #left where leftI = rightI

)

select #right.*

from #left right outer join #right on leftI = rightI

where leftI IS NULL

drop table #left

drop table #right

set nocount off

go

Beware of falling into the trap we discussed earlier. Looking

at our previous "Clustered Index" example, either of these two

following old-style queries would produce a list of system tables that do not

have clustered indices:

select *

from sysobjects o

where type = "S" and

id not in (

    select id

    from sysindexes

    where indid = 1

)

select *

from sysobjects o

where type = "S" and

not exists (

    select id

    from sysindexes i

    where indid = 1 and

    o.id = i.id

)

Here is a simple translation of the above queries into the new

"left outer join" style. Try it out and see what happens.

select *

from sysobjects o left outer join sysindexes i on o.id = i.id

where type = "S" and

i.indid = 1 AND i.id IS NULL

No rows are returned.

The query now fails because the where i.indid = 1

clause makes no sense when we are looking for rows that do not exist in

sysindexes.

To get around this, the following example uses a "derived

query" which forces a select on sysindexes to return a subset of rows with indid

=1, which is subsequently used in the outer join.

select *

from sysobjects o left outer join

(select * from sysindexes i where indid = 1) as i on o.id = i.id

where type = "S" and

i.id IS NULL

In effect, the derived query -

(select * from sysindexes i where indid =

1) as i

creates a notional table called "i", which contains

a copy of all the sysindexes rows with an indid of 1. It is this notional table

that is then fed to the rest of the query. The final results should now match

the NOT EXISTS and NOT IN examples shown above.

About the author

Neil Boyle is an independant SQL Server consultant working out of London, England. Neil's free SQL Server guide is available on-line at http://www.impetus-sql.co.uk

Rate

5 (3)

You rated this post out of 5. Change rating

Share

Share

Rate

5 (3)

You rated this post out of 5. Change rating