Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««23456»»»

Fun with Outer Joins Expand / Collapse
Author
Message
Posted Tuesday, September 18, 2012 7:43 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 12:36 PM
Points: 3,466, Visits: 1,813
vsarade (9/18/2012)
When executing the query with adding the FULL JOIN with where clause like below -

SELECT Professor.Id AS [Professor.Id], Professor.ProfessorName,
CASE Professor.HasTenure WHEN 1 THEN 'True' WHEN 0 THEN 'False' ELSE NULL END AS [Has Tenure],
Class.ProfessorId AS [Class.ProfessorId], Class.ClassName,
Class.ClassYear, Class.ClassSemester
FROM Professor
FULL JOIN Class
ON Professor.Id = Class.ProfessorId AND Class.ClassYear >= 2011
WHERE Professor.HasTenure = 'True'

Then it is giving same results as adding LEFT OUTER JOIN.


Not really surprising. This is for the same reason that putting the "Class.ClassYear = 2011" condition into the where clause eliminates the effect of the LEFT OUTER JOIN and makes it act like an INNER JOIN. Because the condition "Professor.HasTenure = 'True'" is in the WHERE clause it will eliminate all of the entries where HasTenure is NULL. In other words it is eliminating all of the entries that the difference between FULL and OUTER join caused.


Kenneth Fisher
I strive to live in a world where a chicken can cross the road without being questioned about its motives.
--------------------------------------------------------------------------------
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Link to my Blog Post --> www.SQLStudies.com
Post #1360791
Posted Tuesday, September 18, 2012 7:58 AM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Thursday, February 6, 2014 12:59 PM
Points: 801, Visits: 1,962
DB.Duck (9/17/2012)
CELKO (9/9/2012)
...we do not use BIT flags in SQL...

You, sir, couldn't be more wrong. That is all.


Actually it's almost a great point but poorly stated. Let me give this a shot. For the most part you should avoid bit flags.

Often they are used to replicate data. Case in point is the [TerminationDate] of an employee. That column will be NULL for everyone who is still working and have a valid date for folks who have, or were, terminated. Therefore a [Terminated] bit column is not needed.

Then there is the whole aspect of indexing on bits. SQL Server is a quite fine product (all software has bugs so don't start) but even the greatest software has to make compromises. Indexing on a bit suffers. I'll leave further reading to you. There are many good articles and books.

Bits are tempting. Recent versions will cram 8 bits into one byte of actual storage. Sounds cool. You create the first one at design time. You are sucked into a false sense as adding the next seven are very fast (only the meta data is changed). Now you add the ninth and wait as every page in your table is rewritten.


ATB

Charles Kincaid

Post #1360802
Posted Tuesday, September 18, 2012 8:11 AM


Hall of Fame

Hall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of FameHall of Fame

Group: General Forum Members
Last Login: 2 days ago @ 12:36 PM
Points: 3,466, Visits: 1,813
Charles Kincaid (9/18/2012)
DB.Duck (9/17/2012)
CELKO (9/9/2012)
...we do not use BIT flags in SQL...

You, sir, couldn't be more wrong. That is all.


Actually it's almost a great point but poorly stated. Let me give this a shot. For the most part you should avoid bit flags.

Often they are used to replicate data. Case in point is the [TerminationDate] of an employee. That column will be NULL for everyone who is still working and have a valid date for folks who have, or were, terminated. Therefore a [Terminated] bit column is not needed.

Then there is the whole aspect of indexing on bits. SQL Server is a quite fine product (all software has bugs so don't start) but even the greatest software has to make compromises. Indexing on a bit suffers. I'll leave further reading to you. There are many good articles and books.

Bits are tempting. Recent versions will cram 8 bits into one byte of actual storage. Sounds cool. You create the first one at design time. You are sucked into a false sense as adding the next seven are very fast (only the meta data is changed). Now you add the ninth and wait as every page in your table is rewritten.


Ok so I'm willing to accept that bit flags should be used when appropriate. For example when you want to know if a professor has tenure or not, but don't care when they got it. At that point putting in a date field is wasteful. Or possibly a better example would be if you are storing the results of a questionnaire and have a number of true/false or yes/no questions. It's really all about what your data requirements are.

I haven't read about the indexing problem of bit's but I certainly will at this point. Either way though I don't imagine it being all that helpful as the cardinality of a bit is bound to be terrible.

As far as that 9th bit column. I could be wrong (and that happens far more frequently than I would like) but wouldn't the 9th bit column (adding 1 byte to the row) be the same effect as adding a char(1), tinyint, or really any data type?



Kenneth Fisher
I strive to live in a world where a chicken can cross the road without being questioned about its motives.
--------------------------------------------------------------------------------
For better, quicker answers on T-SQL questions, click on the following...
http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better answers on performance questions, click on the following...
http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Link to my Blog Post --> www.SQLStudies.com
Post #1360813
Posted Thursday, September 20, 2012 1:20 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 8:03 AM
Points: 825, Visits: 319
thisisfutile (9/11/2012)


Like in a car, we don't reference left and right sides because that's relevant to whether or not I'm looking at it from the back or from the front. Of course, referencing driver-side and passenger-side is different based on the country you live in. Now that I think about it, like boats, I'm all for using port and starboard for cars.


We use 'near side' for the side nearest the kerb and 'off side' for the other side. Then it doesn't matter which country you are in or where the driver and front seat passenger are in the car. True, this doesn't work for off-road driving but that's something I have no experience of so I'm not bothered.
Post #1361765
Posted Thursday, September 20, 2012 1:49 AM
SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Tuesday, April 15, 2014 8:03 AM
Points: 825, Visits: 319
Often they are used to replicate data. Case in point is the [TerminationDate] of an employee. That column will be NULL for everyone who is still working and have a valid date for folks who have, or were, terminated. Therefore a [Terminated] bit column is not needed.


I would regard this as overloading the [TerminationDate] column by giving it two meanings (terminated yes or no and date thereof). In addition to a column on the main table that states the current status, I would generally have a table with a foreign key of the main table and two columns of 'status' and 'date/time of status' (maybe others such as 'who set the status', etc.) so I could view all changes of status in chronological order, thus allowing for a status going to one value and then to another. For the employee, I could then see the dates of actions such as 'hired', 'suspended', 'terminated', 'rehired'. I can add new status by allowing new values in the 'status' column on the subsidiary table rather than having to add new columns for each new status to the main table.
Post #1361776
Posted Wednesday, October 24, 2012 12:03 PM


SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Tuesday, September 9, 2014 5:00 PM
Points: 91, Visits: 195
Preserved and Unpreserved:

I often use the terms "preserved" and "unpreserved" table. I think it's from some ORACLE documentation that I read about 15+ years ago, but I find it useful.


Bit Columns:

In designing a truly normalized schema, I would do my best to avoid having a bit column in a table. Joe's example of Termination_Date versus Terminated_Y_N is a good one. It's a more "correct" way of getting to a fully normalized solution.

Also, you can create an index on a Bit column, but it would be pretty much useless, as its not at all selective - the optimizer will probably never use it.

Having said all that, I still liked the article - you got across the main point, and I liked the way you used the FULL JOIN to show the pattern and illustrate the principles involved.



Post #1376612
Posted Friday, January 17, 2014 4:47 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Monday, June 30, 2014 8:48 AM
Points: 16, Visits: 116
It seems to me that a lot of people is not getting really HOW a query is executed.
But that should be querying 101.

I would just point out the T-SQL Querying book by Itzik Ben-Gan, but let me recap.

A T-SQL query is composed by clauses, those are:

SELECT TOP(x) ....
FROM ...
WHERE ...
GROUP BY ....
ORDER BY ...

But this is a description of the query, not the real execution order.

If we where writing queries with clauses in the exact execution order, it would be in this way:

FROM ...
WHERE ...
GROUP BY ...
SELECT ...
ORDER BY ...
TOP ...

This is why you can write "Order by 1" and have the data sorted by the first column: because when the order by gets executed, the selected data is already there. However, you cannot use the expressions in the SELECT clause in the WHERE, FROM or GROUP BY clauses, because when those are executed, the select has still to be materialized.

What seems to trouble a lot of people is failing to grasp that the ON clause and the WHERE clause are executed in different moments. And, in between them, the OUTER kicks in.

So, the ON is used to match data for the join, and AFTER the join is evaluated, 1 of 4 things happens:
- if the join is INNER, nothing happens and no data is appended. This is why an INNER join will be always at least as fast as an outer
- if the join is LEFT, RIGHT or FULL OUTER, the non-joining data from the related table(s) will be added to the result set.

After this result set given by the FROM clause is built, the WHERE kicks in. No magic or difficult considerations.

Just always, ALWAYS remember the order of execution.

Everything else in the article is, IMHO, unnecessary clutter.
Post #1532000
Posted Friday, January 17, 2014 7:11 AM


SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Saturday, September 27, 2014 5:20 AM
Points: 467, Visits: 868
"Mr Pepper" ? Surely that's the only real Doctor present.

I found this thread both amusing and enlightening on many levels.
Post #1532044
Posted Friday, January 17, 2014 7:56 AM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Friday, January 31, 2014 8:34 AM
Points: 1, Visits: 5
I still get around 1 or 2 questions a week on this subject. Which is why I wrote the article in the first place.


To my mind, that's an excellent reason for writing the article. Many people using SQL are not expert and an article like this helps them learn.
Post #1532068
Posted Friday, January 17, 2014 8:07 AM
SSC Rookie

SSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC RookieSSC Rookie

Group: General Forum Members
Last Login: Friday, September 26, 2014 2:19 AM
Points: 26, Visits: 190
@Kenneth Fisher: Thx for the article, it's most certainly something that gets a lot of people confused and the given examples are a great help to show what's going on.
On top of that, the part

“Why is that! “ I hear some of you cry. The rest are split between 75% who already know the answer, 15% who don’t care, and the remaining 10% are still trying to figure out what an OUTER JOIN is and why I’m going on about it.

will have me smiling for the rest of the day =)

@Celko: Although I understand the need to 'preach against bad design', IMHO you're picking the wrong fight here and most certainly could have used more respectful wording.



Post #1532075
« Prev Topic | Next Topic »

Add to briefcase «««23456»»»

Permissions Expand / Collapse