Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

ANSI Joins

By Neil Boyle,

Introduction

Are you confused by all this "inner join" style syntax that is becoming more and more prominent lately?

It’s happening because Microsoft is moving towards using ANSI syntax. The old syntax still works, but it is strongly recommended you learn the new syntax, Microsoft have stated that they will not support the old syntax indefinitely.

It’s fairly easy to translate from the old to the new syntax, and correctly translated results will usually produce identical results, but there are exceptions that you should be aware of.

Inner or natural joins

Inner joins return rows where data matching exists in the tables you are joining. This is the simplest type of join, and moving between the old and new syntax should present no problems. In general, this:

select o.name, i.name
from sysobjects o, sysindexes i
where o.id = i.id

Is equivalent to this:

select o.name, i.name
from sysobjects o inner join sysindexes i
on o.id = i.id

Note that the components of the where clause that define the joins between tables are now defined in the from clause instead. The where is now only used for specifying selection criteria. To illustrate this:

select o.name, i.name
from sysobjects o, sysindexes i
where o.id = i.id and
indid = 1

Becomes:

select o.name, i.name
from sysobjects o inner join sysindexes i on o.id = i.id
where indid = 1

Incidentally, what the query above does is to display table names and index names in the current database where the table has a Clustered Index. The sysindexes.indid column is always set to one for clustered indexes.

I will use similar examples later on, and if you are not too clear on these system tables, then now would be a good time for you to review the documentation on sysobjects and sysindexes in Books Online.

Update and Delete

These statements also support the ANSI join syntax, although it’s not very well documented in Books Online for some versions of SQL Server

update t1
set t1.col = …………from t1 inner join t2 on t1.col = t2.col

delete t1
from t1 inner join t2 on t1.col = t2.col

Outer joins

When two tables are joined with an inner join, data will only be returned if matching data exists in both tables. An outer join is like saying "and also include the rows from one table if there are no matching rows in the other one."

With an outer join the columns from the table where data is "missing" are returned as NULL values.

Outer joins come in two basic flavours, called Left and Right. Left outer joins mean that the data must be contained in the table defined to the left side of the equivalence, but not necessarily the right hand side. Right outer joins, of course, work the other way around.

To illustrate this, cut and paste the code below into a Query Analyser window and try running it. I have used the newer ANSI syntax here, and the older equivalents are included but commented out using the "—" comment notation. Comment them back in if you want to try them.

set nocount on
go
create table #left (leftI int)
create table #right (rightI int)

insert #left select 1 union select 2 union select 3
insert #right select 2 union select 3 union select 4 union select 5 union select 6

print "*** Inner join ***"
select * from #left inner join #right on leftI = rightI
-- select * from #left, #right where leftI = rightI

print " "
print "*** left join ***"
select * from #left left outer join #right on leftI = rightI
-- select * from #left, #right where leftI *= rightI

print " "
print "*** right join ***"
select * from #left right outer join #right on leftI = rightI
-- select * from #left, #right where leftI =* rightI

drop table #left
drop table #right
set nocount off
go

Outer Join Gotchas

Unlike natural joins, outer joins have a few traps you can fall into if you translate your query from the old syntax to the new and expect it to produce the same results in both formats.

Returning to our sysobjects and sysindexes examples, the following two queries should produce the same results

select o.name, i.name
from sysobjects o, sysindexes i
where o.id *= i.id and
o.type = "S"
select o.name, i.name
from sysobjects o left outer join sysindexes i on o.id = i.id
where o.type = "S"

However, if we qualify the select criteria on sysindexes as follows, in order to look for details on clustered indexes, the two queries should return different results.

select o.name, i.name
from sysobjects o, sysindexes i
where o.id *= i.id and
o.type = "S" and
i.indid = 1

select o.name, i.name
from sysobjects o left outer join sysindexes i on o.id = i.id
where o.type = "S" and
i.indid = 1

The reason this happens is that the SQL Server reacts differently to the "Where" clause when the outer joined table does not have any data.

When no data is found in the sysindexes table that matches the sysobjects table, indid will be returned as NULL. In this case:

  • With the old syntax, SQL Server will ignore the i.indid = 1 selection criteria and return a row in which indid IS NULL, even though, strictly speaking, the where i.indid =1 clause has failed
  • With the new syntax SQL Server takes the view that the test i.indid = 1 fails because i.inid is NULL, so no value is returned.

Full outer joins

Full outer joins effectively combine the left and right outer joins so that data will be returned if it matches in both tables, or if it exist in either one.

The old join syntax has no direct equivalent of the full outer join.

Try out the SQL below to illustrate the full outer join.

set nocount on
go
create table #left (leftI int)
create table #right (rightI int)
insert #left select 1 union select 2 union select 3
insert #right select 2 union select 3 union select 4 union select 5 union select 6

print "*** FULL join ***"
select * from #left full outer join #right on leftI = rightI

drop table #left
drop table #right
set nocount off
go

Cross Joins

Cross Joins create a Cartesian Product, rather like when you forget to include a "where" clause to join two tables. There are not many cases where you would want to use a cross join, but this article discusses one possible use for them. 

Orphan hunting

Prior to ANSI SQL, if you wanted to find records in one table that did not match a record in another, the recommended solution was to use a "NOT IN" or "NOT EXISTS" based query. It is now recommended that you use outer joins instead. Here is a simple example using all three possibilities – they should all produce the same results:

set nocount on
go
create table #left (leftI int)
create table #right (rightI int)
insert #left select 1 union select 2 union select 3
insert #right select 2 union select 3 union select 4 union select 5 union select 6

select *
from #right
where rightI not in (
    select leftI from #left
)

select *
from #right
where not exists (
    select * from #left where leftI = rightI
)

select #right.*
from #left right outer join #right on leftI = rightI
where leftI IS NULL

drop table #left
drop table #right
set nocount off
go

Beware of falling into the trap we discussed earlier. Looking at our previous "Clustered Index" example, either of these two following old-style queries would produce a list of system tables that do not have clustered indices:

select *
from sysobjects o
where type = "S" and
id not in (
    select id
    from sysindexes
    where indid = 1
)

select *
from sysobjects o
where type = "S" and
not exists (
    select id
    from sysindexes i
    where indid = 1 and
    o.id = i.id
)

Here is a simple translation of the above queries into the new "left outer join" style. Try it out and see what happens.

select *
from sysobjects o left outer join sysindexes i on o.id = i.id
where type = "S" and
i.indid = 1 AND i.id IS NULL

No rows are returned.

The query now fails because the where i.indid = 1 clause makes no sense when we are looking for rows that do not exist in sysindexes.

To get around this, the following example uses a "derived query" which forces a select on sysindexes to return a subset of rows with indid =1, which is subsequently used in the outer join.

select *
from sysobjects o left outer join
(select * from sysindexes i where indid = 1) as i on o.id = i.id
where type = "S" and
i.id IS NULL

In effect, the derived query -

(select * from sysindexes i where indid = 1) as i

creates a notional table called "i", which contains a copy of all the sysindexes rows with an indid of 1. It is this notional table that is then fed to the rest of the query. The final results should now match the NOT EXISTS and NOT IN examples shown above.

About the author

Neil Boyle is an independant SQL Server consultant working out of London, England. Neil's free SQL Server guide is available on-line at http://www.impetus-sql.co.uk

Total article views: 7784 | Views in the last 30 days: 6
 
Related Articles
FORUM

Noncluster at sysindexes

Noncluster at sysindexes

FORUM

object_name(id) returning Null

sysindexes

FORUM

Errors on SYSINDEXES

SQL2000, errors 8929,8928,8939,8965,8964 on SYSINDEXES

FORUM

Outer Join

Not all records selected from a left outer join

FORUM

sysindexes

When a server running sql 7.0 tries to get information from the sysindexes table of another server ...

Tags
t-sql    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones