Fun with Outer Joins


Kenneth.Fisher
vsarade (9/18/2012)
When I execute the query with a FULL JOIN and a WHERE clause, like below:

SELECT Professor.Id AS [Professor.Id], Professor.ProfessorName,
CASE Professor.HasTenure WHEN 1 THEN 'True' WHEN 0 THEN 'False' ELSE NULL END AS [Has Tenure],
Class.ProfessorId AS [Class.ProfessorId], Class.ClassName,
Class.ClassYear, Class.ClassSemester
FROM Professor
FULL JOIN Class
ON Professor.Id = Class.ProfessorId AND Class.ClassYear >= 2011
WHERE Professor.HasTenure = 'True'

it gives the same results as a LEFT OUTER JOIN.


Not really surprising. This happens for the same reason that putting the "Class.ClassYear = 2011" condition into the WHERE clause eliminates the effect of the LEFT OUTER JOIN and makes it act like an INNER JOIN. Because the condition "Professor.HasTenure = 'True'" is in the WHERE clause, it eliminates every row where HasTenure is NULL, which includes every Class row that has no matching Professor. In other words, it eliminates exactly the rows that make a FULL JOIN differ from a LEFT OUTER JOIN.
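To make that concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The sample rows are invented for illustration, and the FULL JOIN is emulated as a LEFT JOIN plus the unmatched Class rows (an assumption made so the script runs on older SQLite builds that lack FULL JOIN). It is not the article's data, just a small case showing the same effect.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Professor (Id INTEGER, ProfessorName TEXT, HasTenure INTEGER);
CREATE TABLE Class (ProfessorId INTEGER, ClassName TEXT, ClassYear INTEGER);
INSERT INTO Professor VALUES (1, 'Adams', 1), (2, 'Baker', 0), (3, 'Clark', 1);
INSERT INTO Class VALUES (1, 'Algebra', 2011), (2, 'Biology', 2012),
                         (3, 'History', 2009), (9, 'Orphan', 2011);
""")

LEFT_JOIN = """
SELECT p.Id AS Id, p.ProfessorName AS ProfessorName,
       p.HasTenure AS HasTenure, c.ClassName AS ClassName
FROM Professor p
LEFT JOIN Class c ON p.Id = c.ProfessorId AND c.ClassYear >= 2011
"""

# FULL JOIN emulated as the LEFT JOIN plus the Class rows that matched nothing.
FULL_JOIN = LEFT_JOIN + """
UNION ALL
SELECT NULL, NULL, NULL, c.ClassName
FROM Class c
WHERE NOT EXISTS (SELECT 1 FROM Professor p
                  WHERE p.Id = c.ProfessorId AND c.ClassYear >= 2011)
"""

def run(join_sql, where=""):
    """Apply an optional WHERE clause on top of the already-joined rows."""
    sql = f"SELECT * FROM ({join_sql}) {where} ORDER BY Id, ClassName"
    return conn.execute(sql).fetchall()

full_rows = run(FULL_JOIN)                          # includes Professor-less classes
full_filtered = run(FULL_JOIN, "WHERE HasTenure = 1")
left_filtered = run(LEFT_JOIN, "WHERE HasTenure = 1")

print(len(full_rows))                  # 5 rows before filtering
print(full_filtered == left_filtered)  # True: the WHERE removed the extra rows
```

The two Class rows with no Professor carry a NULL HasTenure, so the WHERE condition discards them and the filtered results are identical.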

Kenneth Fisher
I strive to live in a world where a chicken can cross the road without being questioned about its motives.
--------------------------------------------------------------------------------
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Link to my Blog Post --> www.SQLStudies.com
Charles Kincaid
DB.Duck (9/17/2012)
CELKO (9/9/2012)
...we do not use BIT flags in SQL...

You, sir, couldn't be more wrong. That is all.


Actually it's almost a great point but poorly stated. Let me give this a shot. For the most part you should avoid bit flags.

Often they duplicate data that is already stored elsewhere. Case in point is the [TerminationDate] of an employee. That column will be NULL for everyone who is still working and will hold a valid date for people who have left or were terminated. Therefore a [Terminated] bit column is not needed.
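As a rough illustration of that point, the flag can be derived from the date at query time instead of being stored. This sketch uses Python's sqlite3 module with a hypothetical Employee table and made-up rows; the CASE expression is written so that it would also be valid T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table: only the date is stored, no separate bit flag.
CREATE TABLE Employee (Name TEXT, TerminationDate TEXT);
INSERT INTO Employee VALUES ('Ann', NULL), ('Bob', '2012-06-30');
""")

# Derive the "Terminated" flag from the date whenever it is needed.
derived = conn.execute("""
SELECT Name,
       CASE WHEN TerminationDate IS NULL THEN 0 ELSE 1 END AS Terminated
FROM Employee
ORDER BY Name
""").fetchall()

print(derived)  # [('Ann', 0), ('Bob', 1)]
```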

Then there is the whole aspect of indexing on bits. SQL Server is quite a fine product (all software has bugs so don't start) but even the greatest software has to make compromises. Indexing on a bit suffers. I'll leave further reading to you. There are many good articles and books.

Bits are tempting. SQL Server will cram up to 8 bit columns into one byte of actual storage. Sounds cool. You create the first one at design time. You are lulled into a false sense of security because adding the next seven is very fast (only the metadata is changed). Now you add the ninth and wait as every page in your table is rewritten.

ATB
Charles Kincaid
Kenneth.Fisher


Ok so I'm willing to accept that bit flags should be used when appropriate. For example when you want to know if a professor has tenure or not, but don't care when they got it. At that point putting in a date field is wasteful. Or possibly a better example would be if you are storing the results of a questionnaire and have a number of true/false or yes/no questions. It's really all about what your data requirements are.

I haven't read about the indexing problem with bits, but I certainly will now. Either way, I don't imagine an index on a bit being all that helpful, as the cardinality of a bit is bound to be terrible.

As for that 9th bit column: I could be wrong (and that happens far more frequently than I would like), but wouldn't adding the 9th bit column (adding 1 byte to the row) have the same effect as adding a char(1), tinyint, or really any other data type?
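For reference, the storage arithmetic being discussed can be sketched in a few lines of Python. It relies only on the documented rule that SQL Server stores up to 8 bit columns per byte; the function name is made up for illustration, and the sketch says nothing about whether existing pages get rewritten when the extra byte is added.

```python
def bit_storage_bytes(bit_columns: int) -> int:
    """Bytes used to store the given number of bit columns (8 per byte)."""
    return -(-bit_columns // 8)  # ceiling division without floats

for n in (1, 8, 9, 16, 17):
    print(n, "bit columns ->", bit_storage_bytes(n), "byte(s)")
```

So the ninth bit column grows the fixed-length part of the row by one byte, the same size increase as a tinyint or char(1).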

marlon.seton
thisisfutile (9/11/2012)


Like in a car, we don't reference left and right sides because that depends on whether I'm looking at it from the back or from the front. Of course, referencing driver-side and passenger-side is different based on the country you live in. Now that I think about it, like boats, I'm all for using port and starboard for cars. :-P


We use 'near side' for the side nearest the kerb and 'off side' for the other side. Then it doesn't matter which country you are in or where the driver and front seat passenger are in the car. True, this doesn't work for off-road driving but that's something I have no experience of so I'm not bothered.
marlon.seton
Often they duplicate data that is already stored elsewhere. Case in point is the [TerminationDate] of an employee. That column will be NULL for everyone who is still working and will hold a valid date for people who have left or were terminated. Therefore a [Terminated] bit column is not needed.


I would regard this as overloading the [TerminationDate] column by giving it two meanings (terminated yes or no, and the date thereof). In addition to a column on the main table that states the current status, I would generally have a subsidiary table with a foreign key to the main table and two columns, 'status' and 'date/time of status' (maybe others such as 'who set the status', etc.), so I could view all changes of status in chronological order, thus allowing for a status going to one value and then to another. For the employee, I could then see the dates of actions such as 'hired', 'suspended', 'terminated', 'rehired'. I can add a new status by allowing new values in the 'status' column on the subsidiary table rather than having to add a new column to the main table for each new status.
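A minimal sketch of that layout, again using Python's sqlite3 module as a stand-in. The table names, column names and rows are all hypothetical and chosen only to illustrate the idea of a status history table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Main table keeps only the current status (hypothetical schema).
CREATE TABLE Employee (
    Id INTEGER PRIMARY KEY,
    Name TEXT,
    CurrentStatus TEXT
);
-- Subsidiary table records every status change with its date.
CREATE TABLE EmployeeStatusHistory (
    EmployeeId INTEGER REFERENCES Employee(Id),
    Status TEXT,
    StatusDate TEXT
);
INSERT INTO Employee VALUES (1, 'Ann', 'rehired');
INSERT INTO EmployeeStatusHistory VALUES
    (1, 'hired',      '2010-01-04'),
    (1, 'terminated', '2011-03-31'),
    (1, 'rehired',    '2012-09-03');
""")

# All status changes for one employee, in chronological order.
history = conn.execute("""
SELECT Status, StatusDate
FROM EmployeeStatusHistory
WHERE EmployeeId = 1
ORDER BY StatusDate
""").fetchall()

for status, status_date in history:
    print(status_date, status)
```

A new kind of status, such as 'suspended', is then just another row value rather than another column.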
Steve Rosenbach
Preserved and Unpreserved:

I often use the terms "preserved" and "unpreserved" table. I think they come from some Oracle documentation that I read 15+ years ago, but I find them useful.


Bit Columns:

In designing a truly normalized schema, I would do my best to avoid having a bit column in a table. Joe's example of Termination_Date versus Terminated_Y_N is a good one. It's a more "correct" way of getting to a fully normalized solution.

Also, you can create an index on a bit column, but it would be pretty much useless, as it's not at all selective; the optimizer will probably never use it.

Having said all that, I still liked the article - you got across the main point, and I liked the way you used the FULL JOIN to show the pattern and illustrate the principles involved.



chiesa.alberto
It seems to me that a lot of people don't really understand HOW a query is executed.
But that should be querying 101.

I would just point to the T-SQL Querying book by Itzik Ben-Gan, but let me recap.

A T-SQL query is composed of clauses, namely:

SELECT TOP(x) ....
FROM ...
WHERE ...
GROUP BY ....
ORDER BY ...

But this is a description of the query, not the order in which the clauses are logically processed.

If we were writing queries with the clauses in the exact execution order, it would look like this:

FROM ...
WHERE ...
GROUP BY ...
SELECT ...
ORDER BY ...
TOP ...

This is why you can write "ORDER BY 1" and have the data sorted by the first column: when the ORDER BY is executed, the selected data is already there. However, you cannot refer to the expressions (or column aliases) defined in the SELECT clause from the WHERE, FROM or GROUP BY clauses, because when those are executed, the SELECT has not been evaluated yet.
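The ordering above can be mimicked in plain Python, one stage per clause. This is only a toy model of the logical order, not of how the engine physically executes anything; the rows and the query it imitates (a TOP(2) count of classes per department for years from 2011 on, ordered by the count) are invented for illustration.

```python
# Toy model of: SELECT TOP(2) dept, COUNT(*) AS n FROM classes
#               WHERE year >= 2011 GROUP BY dept ORDER BY n DESC, dept
classes = [
    {"dept": "Math", "year": 2011},
    {"dept": "Math", "year": 2012},
    {"dept": "Art", "year": 2012},
    {"dept": "Bio", "year": 2009},
    {"dept": "Bio", "year": 2012},
    {"dept": "Bio", "year": 2013},
]

# 1. FROM: the source rows.
from_rows = list(classes)

# 2. WHERE: filters rows; the alias "n" does not exist yet.
where_rows = [r for r in from_rows if r["year"] >= 2011]

# 3. GROUP BY: bucket the surviving rows by department.
groups = {}
for r in where_rows:
    groups.setdefault(r["dept"], []).append(r)

# 4. SELECT: only now are the output columns (and the alias "n") produced.
selected = [{"dept": dept, "n": len(rows)} for dept, rows in groups.items()]

# 5. ORDER BY: can use "n" because SELECT has already run.
ordered = sorted(selected, key=lambda r: (-r["n"], r["dept"]))

# 6. TOP: applied last, to the ordered rows.
top_rows = ordered[:2]

print(top_rows)  # [{'dept': 'Bio', 'n': 2}, {'dept': 'Math', 'n': 2}]
```

Trying to use `n` in step 2 would fail for the same reason an alias fails in a WHERE clause: it has not been computed yet.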

What seems to trouble a lot of people is failing to grasp that the ON clause and the WHERE clause are executed at different moments. And, in between them, the OUTER kicks in.

So, the ON is used to match data for the join, and AFTER the join is evaluated, one of two things happens:
- if the join is INNER, nothing happens and no data is appended. This is why an INNER join will always be at least as fast as an outer join
- if the join is LEFT, RIGHT or FULL OUTER, the non-matching rows from the preserved table(s) are added to the result set, with NULLs in the other table's columns.

After this result set given by the FROM clause is built, the WHERE kicks in. No magic or difficult considerations.
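Here is a small sketch of that sequence, using Python's sqlite3 module in place of SQL Server and a few invented rows. The same year filter is placed first in the ON clause and then in the WHERE clause of a LEFT JOIN, with an INNER JOIN for comparison.

```python
import sqlite3

# Hypothetical sample data, not the article's.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Professor (Id INTEGER, ProfessorName TEXT);
CREATE TABLE Class (ProfessorId INTEGER, ClassName TEXT, ClassYear INTEGER);
INSERT INTO Professor VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
INSERT INTO Class VALUES (1, 'Algebra', 2011), (2, 'Biology', 2012),
                         (3, 'History', 2009);
""")

def query(sql):
    return conn.execute(sql).fetchall()

# Filter in ON: applied while matching, then the OUTER step re-adds Clark.
filter_in_on = query("""
SELECT p.ProfessorName, c.ClassName
FROM Professor p
LEFT JOIN Class c ON p.Id = c.ProfessorId AND c.ClassYear >= 2011
ORDER BY p.Id
""")

# Filter in WHERE: applied after the OUTER step, so Clark's row is removed
# (his only class is from 2009, which fails the year test).
filter_in_where = query("""
SELECT p.ProfessorName, c.ClassName
FROM Professor p
LEFT JOIN Class c ON p.Id = c.ProfessorId
WHERE c.ClassYear >= 2011
ORDER BY p.Id
""")

inner_join = query("""
SELECT p.ProfessorName, c.ClassName
FROM Professor p
INNER JOIN Class c ON p.Id = c.ProfessorId AND c.ClassYear >= 2011
ORDER BY p.Id
""")

print(filter_in_on)                   # Clark appears with a NULL class
print(filter_in_where == inner_join)  # True: the WHERE undid the outer join
```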

Just always, ALWAYS remember the order of execution.

Everything else in the article is, IMHO, unnecessary clutter.
davoscollective
"Mr Pepper"? Surely that's the only real Doctor present.

I found this thread both amusing and enlightening on many levels.
c_harnett
I still get around 1 or 2 questions a week on this subject. Which is why I wrote the article in the first place.


To my mind, that's an excellent reason for writing the article. Many people using SQL are not expert and an article like this helps them learn.
deroby
@Kenneth Fisher: Thx for the article, it's most certainly something that gets a lot of people confused and the given examples are a great help to show what's going on.
On top of that, the part

“Why is that!” I hear some of you cry. The rest are split between 75% who already know the answer, 15% who don’t care, and the remaining 10% are still trying to figure out what an OUTER JOIN is and why I’m going on about it.

will have me smiling for the rest of the day =)

@Celko: Although I understand the need to 'preach against bad design', IMHO you're picking the wrong fight here and most certainly could have used more respectful wording.


