SQL & the JOIN Operator


Jeff Moden
Chris.Strolia-Davis (10/7/2009)
Excellent article.

You mentioned not knowing a real world application of the CROSS JOIN.

In my experience, this is typically used for creating test data.

Sometimes you need to test data in all sorts of different configurations. By using a cross join, you can set up the different parameters and try all combinations.

Additionally, if you are trying to create bogus data for a test environment, this is one way of taking data from different parts of the real data and generating new data that is not actually real.

In many cases, this type of join is used on temporary or memory based tables in a batch since the data it produces often needs to go through additional transformation and filtering before it is useful.

I guess, technically, it isn't used in a "real world" application, but it is used for real world issues.

The real world applications for using CROSS JOINS are many and varied. Most of them revolve around the use of a Tally Table (Numbers Table) to do things like make a Tally CTE which in turn would be cross joined to a delimited column to do splits or used to generate contiguous dates, etc. When limited by Triangular self joins (about half a cross join but still uses CROSS JOIN), they can be used to generate "schedule pairs" and a whole lot more. And, you're also correct... they can be used to very quickly generate very large volumes of constrained randomized test data. It's not uncommon to see some of the frequent posters generate a million row test table to make their point about a performance problem/solution. Rog_os also pointed out a frequent use above.
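A couple of the uses mentioned above can be sketched briefly. This is only an illustration, run through Python's built-in sqlite3 module so it works anywhere; the `Digits` and `Teams` tables and the start date are made-up assumptions, and the SQLite `date()` call would need to become `DATEADD` on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical helper table holding the digits 0..9.
conn.execute("CREATE TABLE Digits (d INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Digits (d) VALUES (?)", [(i,) for i in range(10)])

# CROSS JOIN two copies of Digits to get the numbers 0..99 (a small tally set),
# then turn the first 7 numbers into contiguous dates.
dates = [row[0] for row in conn.execute("""
    SELECT date('2009-10-07', '+' || (tens.d * 10 + ones.d) || ' days') AS TheDate
    FROM Digits AS tens
    CROSS JOIN Digits AS ones
    WHERE tens.d * 10 + ones.d < 7
    ORDER BY tens.d * 10 + ones.d
""")]

# Triangular self join: a CROSS JOIN limited to "left < right",
# which yields each pairing exactly once ("schedule pairs").
conn.execute("CREATE TABLE Teams (Name TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO Teams (Name) VALUES (?)", [("A",), ("B",), ("C",), ("D",)])
pairs = conn.execute("""
    SELECT home.Name, away.Name
    FROM Teams AS home
    CROSS JOIN Teams AS away
    WHERE home.Name < away.Name
    ORDER BY home.Name, away.Name
""").fetchall()

print(dates)
print(pairs)
```

Four teams give six pairs, which is the "about half a cross join" effect: 16 combinations, minus the 4 self-pairs, divided by 2.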

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
Although they tell us that they want it real bad, our primary goal is to ensure that we don't actually give it to them that way.
Although change is inevitable, change for the better is not.
Just because you can do something in PowerShell, doesn't mean you should.

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Charles Kincaid
Great article. Very good primer on the subject. When you do your next one on advanced joins, please talk about moving things from the WHERE to the ON for better performance. Case in point: I needed a list of all CUSTOMERs who had 'CREDIT' type ORDERs placed in the last 30 days. So instead of:
WHERE [ORDER].TypeId = 3 AND ...


use
INNER JOIN [ORDER] ON [ORDER].CustomerID = [CUSTOMER].Id AND [ORDER].TypeId = 3
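A small sketch of what moving the predicate does to the result set, using Python's built-in sqlite3 with made-up `Customer` and `Orders` tables (names and rows are assumptions for illustration only). For an INNER JOIN both placements return the same rows, so any speed difference is worth checking against the actual execution plan; for a LEFT JOIN the placement changes the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerID INTEGER, TypeId INTEGER);
INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob');
-- Ann has one TypeId = 3 order; Bob has none.
INSERT INTO Orders VALUES (10, 1, 3), (11, 1, 1), (12, 2, 1);
""")

def run(join_kind, on_extra, where_clause):
    # Build the same query with the TypeId filter in either the ON or the WHERE.
    sql = f"""
        SELECT Customer.Name, Orders.Id
        FROM Customer
        {join_kind} JOIN Orders
            ON Orders.CustomerID = Customer.Id {on_extra}
        {where_clause}
        ORDER BY Customer.Name, Orders.Id
    """
    return conn.execute(sql).fetchall()

inner_where = run("INNER", "", "WHERE Orders.TypeId = 3")
inner_on = run("INNER", "AND Orders.TypeId = 3", "")
left_where = run("LEFT", "", "WHERE Orders.TypeId = 3")
left_on = run("LEFT", "AND Orders.TypeId = 3", "")

print(inner_where, inner_on)  # same rows either way
print(left_where)             # the WHERE discards customers with no match
print(left_on)                # the ON keeps them, with NULL order columns
```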



ATB
Charles Kincaid
Jeff Moden
Nice article, Wagner! Articles of this nature should be required reading for anyone just starting out in SQL and those that enjoy a refresher. Well done. In the vein of "One picture is worth a thousand words", you did a great job with the graphics. Thanks for taking the time.

--Jeff Moden

alen teplitsky
Is there any performance difference between doing a UNION ALL and a JOIN when you need to get all the rows from two or more tables? I have a database where I do a UNION ALL on some data in 20 or so tables when running a report, and it seems to take a long time.
Tobar
We use "cross joins", actually no join at all, when we are creating our dimensions in our data warehouse.

<><
Livin' down on the cube farm. Left, left, then a right.
David Walker-278941
Yet another article that makes people think that table aliases (t1 and t2) are part of the required syntax. This first Join statement:

SELECT t1.key1, t1.field1 as Name, t1.key2 as T1Key,
t2.key2 as T2Key, t2.field1 as City
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.key2 = t2.key2

Should be written like this:

SELECT Table1.key1, Table1.field1 as Name, Table1.key2 as T1Key,
Table2.key2 as T2Key, Table2.field1 as City
FROM Table1
INNER JOIN Table2 ON Table1.key2 = Table2.key2

Isn't that easier to read?

In this simple example, why in the WORLD are you including table aliases for Table1 and Table2? They are COMPLETELY unnecessary. When new users are exposed to table aliases in this kind of setting, they naturally think the table aliases are required.

It gets more important when the tables are named ClientStatus and ClientOrders and people start aliasing them as "cs" and "co", or even worse, as "a" and "b".

PLEASE don't teach that table aliases are a required part of the syntax. When they are required, they are required, but in simple examples like this one they add nothing -- and they DETRACT from human readability. The human mind has to keep track of which alias goes with which table, and it gets hard when there are 5 or 6 tables involved.

It's also good to see a <= join here. That's often left out of examples. Don't forget to also teach that you can join tables on two conditions (using AND/OR), not just one.
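As a rough sketch of a join on two conditions, one of them a <= comparison, here is a version run through Python's built-in sqlite3. The `Sales` and `Discounts` tables and their rows are invented for illustration and are not from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales (SaleId INTEGER PRIMARY KEY, Region TEXT, Amount INTEGER);
CREATE TABLE Discounts (Region TEXT, MinAmount INTEGER, Pct INTEGER);
INSERT INTO Sales VALUES (1, 'East', 50), (2, 'East', 150), (3, 'West', 150);
INSERT INTO Discounts VALUES ('East', 100, 5), ('West', 200, 10);
""")

# Two join conditions combined with AND: an equality on Region
# and a <= comparison on the amount threshold.
rows = conn.execute("""
    SELECT Sales.SaleId, Discounts.Pct
    FROM Sales
    INNER JOIN Discounts
        ON Sales.Region = Discounts.Region
       AND Discounts.MinAmount <= Sales.Amount
    ORDER BY Sales.SaleId
""").fetchall()

print(rows)
```

Only sale 2 qualifies: it is the only row whose region matches a discount and whose amount reaches that discount's minimum.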

Thanks.

David Walker
alen teplitsky
David Walker-278941 (10/7/2009)
Yet another article that makes people think that table aliases (t1 and t2) are part of the required syntax. This first Join statement:

SELECT t1.key1, t1.field1 as Name, t1.key2 as T1Key,
t2.key2 as T2Key, t2.field1 as City
FROM Table1 t1
INNER JOIN Table2 t2 ON t1.key2 = t2.key2

Should be written like this:

SELECT Table1.key1, Table1.field1 as Name, Table1.key2 as T1Key,
Table2.key2 as T2Key, Table2.field1 as City
FROM Table1
INNER JOIN Table2 ON Table1.key2 = Table2.key2

Isn't that easier to read?

In this simple example, why in the WORLD are you including table aliases for Table1 and Table2? They are COMPLETELY unnecessary. When new users are exposed to table aliases in this kind of setting, they naturally think the table aliases are required.

It gets more important when the tables are named ClientStatus and ClientOrders and people start aliasing them as "cs" and "co", or even worse, as "a" and "b".

PLEASE don't teach that table aliases are a required part of the syntax. When they are required, they are required, but in simple examples like this one they add nothing -- and they DETRACT from human readability. The human mind has to keep track of which alias goes with which table, and it gets hard when there are 5 or 6 tables involved.

It's also good to see a <= join here. That's often left out of examples. Don't forget to also teach that you can join tables on two conditions (using AND/OR), not just one.

Thanks.

David Walker


I have seen the same thing.

The rule is: you write code to look cool; hardware is cheap.
davidpenton
I created this article/test page a number of years ago that describes Inner, Left, Right, Full, Cross, Triple, Self, Union and Union All queries. Joins



Chris.Strolia-Davis
SQL Noob (10/7/2009)
Is there any performance difference between doing a UNION ALL and a JOIN when you need to get all the rows from two or more tables? I have a database where I do a UNION ALL on some data in 20 or so tables when running a report, and it seems to take a long time.

I'm not certain I understand your question.

A UNION ALL combines the data from two similar sets into a single set with all members of each set.

A JOIN matches the data in one set with the data in another set.

Based on my current knowledge of SQL, I believe it might be theoretically possible to replicate a UNION ALL in a JOIN (although even in theory I think it would be dependent on your table structure), but it would be atypical and in all likelihood it would take longer to run than a straight "union all".

I have not tested this theory.

I'll admit, even as I attempt to consider an alternative that involves a JOIN, I still envision needing a UNION ALL in the query and a requirement for each record to have a unique id across all tables.
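To make the distinction concrete, here is a minimal sketch using Python's built-in sqlite3. The `Sales2008` and `Sales2009` tables are hypothetical stand-ins for two similarly shaped tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sales2008 (Id INTEGER, Amount INTEGER);
CREATE TABLE Sales2009 (Id INTEGER, Amount INTEGER);
INSERT INTO Sales2008 VALUES (1, 100), (2, 200);
INSERT INTO Sales2009 VALUES (2, 250), (3, 300);
""")

# UNION ALL stacks the two sets: every row from both tables, same columns.
stacked = conn.execute("""
    SELECT Id, Amount FROM Sales2008
    UNION ALL
    SELECT Id, Amount FROM Sales2009
    ORDER BY Id, Amount
""").fetchall()

# A JOIN matches rows across the sets: only Ids present in both, wider rows.
matched = conn.execute("""
    SELECT Sales2008.Id, Sales2008.Amount, Sales2009.Amount
    FROM Sales2008
    INNER JOIN Sales2009 ON Sales2008.Id = Sales2009.Id
    ORDER BY Sales2008.Id
""").fetchall()

print(stacked)
print(matched)
```

The two results have different shapes: the UNION ALL returns four two-column rows, while the JOIN returns a single three-column row for the one Id the tables share.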
SanjayAttray
Excellent article, Wagner! A good review of solid SQL fundamentals, old but basic material that any developer needs to know to be called a good developer. Thanks again.

SQL DBA.