SQL Server JOINS

Question

Post reply

SQL Server JOINS

shaimaa.tarekelshoeiby

SSC Veteran

Points: 212
More actions
August 22, 2014 at 7:42 am

#292493

Hi all,
Can someone help explain to me how when I have 12 rows in table A and 10 in B and I do an inner join , I would get more rows than
in both A and B ?
Same with left and right joins...
This is just a simplified example. Let me share one of my issues with you
I have 2 views ; which was originally SQL on 2 base tables Culture and Trials.
And then when attempting to add another table Culture Steps, one of the team members separated the SQL into 2 views
Since this produces an error when updating(modification cannot be done as it affects multiple base tables), I would like to get
back to changing the SQL such that I no longer use the views but achieve the same results.
One of the views has
SELECT some columns
FROM dbo.Culture RIGHT JOIN
dbo.Trial ON dbo.Culture.cultureID = dbo.Trial.CultureID LEFT OUTER JOIN
dbo.TrialCultureSteps_view_part1 ON dbo.Culture.cultureID = dbo.TrialCultureSteps_view_part1.cultureID
The other TrialCultureSteps_view_part1 view
SELECT DISTINCT dbo.Culture.cultureID,
(SELECT TOP (1) WeekNr
FROM dbo.CultureStep
WHERE (CultureID = dbo.Culture.cultureID)
ORDER BY CultureStepID) AS normalstartweek
FROM dbo.Culture INNER JOIN
dbo.CultureStep AS CultureStep_1 ON dbo.Culture.cultureID = CultureStep_1.CultureID
So how can I combine the joins the achieve the same results using SQL only on tables without the need for views?

Viewing 15 posts - 1 through 15 (of 23 total)

You must be logged in to reply to this topic. Login to reply

Eirikur Eiriksson SSC Guru Points: 182948 More actions · Answer 1

Quick thought, could this be caused by Cartesian Product, see the example?

USE tempdb;

GO

DECLARE @TBL01 TABLE

(

T_ID INT NOT NULL

);

DECLARE @TBL02 TABLE

(

T_ID INT NOT NULL

);

INSERT INTO @TBL01 (T_ID)

VALUES (1),(2),(3),(4),(5),(6),(5),(6),(7),(8);

INSERT INTO @TBL02 (T_ID)

VALUES (1),(2),(3),(4),(5),(5),(6),(6),(7),(8),(9),(10);

SELECT

*

FROM @TBL01 T1

INNER JOIN @TBL02 T2

ON T1.T_ID = T2.T_ID;

Results

T_ID T_ID

----------- ------

1 1

2 2

3 3

4 4

5 5

6 6

5 5

6 6

7 7

8 8

gbritton1 SSCertifiable Points: 6522 More actions · Answer 2

I don't think that's a Cartesian product. If it were, you would have 89 (8*10) rows of ouput.

A CROSS JOIN would give you a Cartesian Product

gbritton1 SSCertifiable Points: 6522 More actions · Answer 3

Please post CREATE TABLE statements for all the tables involved, some sample data for each as INSERT INTO statements, and the output you want to see from your join.

Grant Fritchey SSC Guru Points: 399171 More actions · Answer 4

Do you have enforced referential constraints on the columns you're joining on? Does the data accurately map between the two tables, or is it possible that you're getting multiple matches for your JOIN criteria? That would be the most likely reason for more rows is that more values match the JOIN criteria for any given row.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

Eirikur Eiriksson SSC Guru Points: 182948 More actions · Answer 5

gbritton1 (8/22/2014)
I don't think that's a Cartesian product. If it were, you would have 89 (8*10) rows of ouput.
A CROSS JOIN would give you a Cartesian Product

A CROSS JOIN would produce a full Cartesian Product, equal duplicate values will produce partial Cartesian Product, regardless of the join type.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 6

Eirikur Eiriksson (8/22/2014)
gbritton1 (8/22/2014)
I don't think that's a Cartesian product. If it were, you would have 89 (8*10) rows of ouput.
A CROSS JOIN would give you a Cartesian Product
A CROSS JOIN would produce a full Cartesian Product, equal duplicate values will produce partial Cartesian Product, regardless of the join type.

Another of Jeff's articles to the rescue:

http://www.sqlservercentral.com/articles/T-SQL/61539/%5B/url%5D

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

shaimaa.tarekelshoeiby SSC Veteran Points: 212 More actions · Answer 7

Thank you all for your replies

Before copying any create or insert data, here is an example

Select * from Culture --2327 rows

Select * from Trial --6793 rows

Select * from CultureStep --13957 rows

Select * from Culture inner join Trial on Culture.CultureID = Trial.CultureID --6785

Select * from (Culture inner join Trial on Culture.CultureID = Trial.CultureID) inner join CultureStep

on Culture.CultureID=CultureStep.CultureID --39661

How come when the first join yields 6785 rows when inner joined with culturestep which has 13957 rows , this would yield row number greater than both (39661)?!

I am not very good at this , kindly explain with details and/or refer me to a good example to clear all kinds of joins and help me understand what results to expect

Eirikur Eiriksson SSC Guru Points: 182948 More actions · Answer 8

shaimaa.tarekelshoeiby (8/22/2014)
Thank you all for your replies
Before copying any create or insert data, here is an example
Select * from Culture --2327 rows
Select * from Trial --6793 rows
Select * from CultureStep --13957 rows
Select * from Culture inner join Trial on Culture.CultureID = Trial.CultureID --6785
Select * from (Culture inner join Trial on Culture.CultureID = Trial.CultureID) inner join CultureStep
on Culture.CultureID=CultureStep.CultureID --39661
How come when the first join yields 6785 rows when inner joined with culturestep which has 13957 rows , this would yield row number greater than both (39661)?!
I am not very good at this , kindly explain with details and/or refer me to a good example to clear all kinds of joins and help me understand what results to expect

To me it looks like a many-to-many table without a filter, don't think it is a coincidence that the cardinality of Trial = CultureStep / 2 and CultureStep = Culture x 6

Quick question, can you provide either an ERD or a create table script for the three tables?

shaimaa.tarekelshoeiby SSC Veteran Points: 212 More actions · Answer 9

Culture

CREATE TABLE [dbo].[Culture](

[cultureID] [int] IDENTITY(1,1) NOT NULL,

[CultureName] [nvarchar](50) NOT NULL,

[Process] [nvarchar](50) NULL,

[Cropspecialist] [nvarchar](50) NULL,

[Requestor] [nvarchar](50) NULL,

[CuttingsOrSeed] [nvarchar](50) NULL,

[EndLocation] [nvarchar](50) NULL,

[StartWeek] [int] NULL,

[StartWeekDeviation] [int] NULL,

[DeliveryWeek] [int] NULL,

[DeliveryWeekDeviation] [int] NULL,

[Purpose] [nvarchar](50) NULL,

[Crop] [nvarchar](50) NULL,

[EntryVariation?] [nvarchar](50) NULL,

[CultureRemarks1] [nvarchar](255) NULL,

[CultureRemarks2] [nvarchar](255) NULL,

[extra] [int] NULL,

[planningGroupID1] [int] NULL,

[extra2] [int] NULL,

[extra3] [int] NULL,

[croptype1] [nvarchar](255) NULL,

[FLS/VEG] [nvarchar](255) NULL,

[Department] [nvarchar](255) NULL,

[subProcess] [nvarchar](255) NULL,

[timeregCode] [nvarchar](255) NULL,

[createdBy] [nvarchar](255) NULL,

[dateCreated] [nvarchar](255) NULL,

CONSTRAINT [aaaaaCulture_PK] PRIMARY KEY NONCLUSTERED

Culture Step:

CREATE TABLE [dbo].[CultureStep](

[CultureStepID] [int] IDENTITY(1,1) NOT NULL,

[CultureID] [int] NULL,

[EntryVarNr] [int] NULL,

[WeekNr] [int] NULL,

[stepordernr] [int] NULL,

[days] [int] NULL,

[plants per entry] [int] NULL,

[plants per fust] [decimal](18, 1) NULL,

[extraM2] [int] NULL,

[fustPerM2(Manual)] [float] NULL,

[SoilType] [nvarchar](255) NULL,

[covering soil] [nvarchar](255) NULL,

[watering] [nvarchar](255) NULL,

[fungicide treatm after sowing] [nvarchar](255) NULL,

[name Fungicide] [nvarchar](255) NULL,

[conc Fungiside] [nvarchar](255) NULL,

[type of Hood] [nvarchar](255) NULL,

[Pincing?] [nvarchar](50) NULL,

[temperature dayNight C] [nvarchar](255) NULL,

[Neg diff] [nvarchar](255) NULL,

[Cold drop] [nvarchar](255) NULL,

[artificial light (y/n)] [nvarchar](255) NULL,

[daylength neutral (yes/no)] [nvarchar](255) NULL,

[daylength] [nvarchar](255) NULL,

[standard PGR treatment?] [nvarchar](255) NULL,

[name PGR] [nvarchar](255) NULL,

[conc PGR] [nvarchar](255) NULL,

[Spraying Schedule PGR] [nvarchar](255) NULL,

[PH] [nvarchar](255) NULL,

[EC] [nvarchar](255) NULL,

[EC in drench water] [nvarchar](255) NULL,

[fertilisation schedule vegetative] [nvarchar](255) NULL,

[fertilisation schedule generative] [nvarchar](255) NULL,

[remarks fertilisation] [nvarchar](255) NULL,

[Common diseases] [nvarchar](255) NULL,

[remarks herbicides] [nvarchar](255) NULL,

[remarks fungicides] [nvarchar](255) NULL,

[prefLocation] [nvarchar](255) NULL,

[year] [int] NULL,

[dummy] [nvarchar](50) NULL,

[plants/hour] [int] NULL,

[EntryFactor] [decimal](18, 2) NULL,

[fustID1] [int] NULL,

[Actionid1] [int] NULL,

[CultureStepNameID1] [int] NULL,

[daynr] [int] NULL,

[endweekmanual] [int] NULL,

[CultureStepRemarks] [nvarchar](255) NULL,

CONSTRAINT [aaaaaCultureStep_PK] PRIMARY KEY NONCLUSTERED

Trial:

CREATE TABLE [dbo].[Trial](

[TrialID] [int] IDENTITY(1,1) NOT NULL,

[TrialCode] [nvarchar](50) NOT NULL,

[ActualNrOfTotalEntries] [int] NULL,

[ActualNrOfEntriesVarNr1] [int] NULL,

[ActualNrOfEntriesVarNr2] [int] NULL,

[ActualNrOfEntriesVarNr3] [int] NULL,

[ActualNrOfEntriesVarNr4] [int] NULL,

[AtualStartweek] [int] NULL,

[PlannedStartWeek] [int] NULL,

[ActualStartYear] [nvarchar](50) NULL,

[CultureID] [int] NULL,

[weekOfTrialrequest] [int] NULL,

[yearOfTrialrequest] [nvarchar](50) NULL,

[remarks1] [nvarchar](200) NULL,

[remarks2] [nvarchar](200) NULL,

[RequestedNrOfTotalEntries] [int] NULL,

[RequestedStartweek] [int] NULL,

[RequestedNrOfEntriesVarNr1] [int] NULL,

[RequestedNrOfEntriesVarNr2] [int] NULL,

[RequestedNrOfEntriesVarNr3] [int] NULL,

[RequestedNrOfEntriesVarNr4] [int] NULL,

[plannedRequest] [nvarchar](255) NULL,

[trialCodeGroup] [nvarchar](255) NULL,

[OrigRequestedNrOfTotalEntries] [int] NULL,

[OrigRequestedStartweek] [int] NULL,

[OrigRequestedNrOfEntriesVarNr1] [int] NULL,

[OrigRequestedNrOfEntriesVarNr2] [int] NULL,

[OrigRequestedNrOfEntriesVarNr3] [int] NULL,

[OrigRequestedNrOfEntriesVarNr4] [int] NULL,

[PlanningAproved] [nvarchar](255) NULL,

[EntryFactorTrial] [float] NULL,

[FirstStepWithEntryFactor] [float] NULL,

[ReasonEntryFactor] [nvarchar](100) NULL,

[RequestedEntryFacorTrial] [float] NULL,

[RequestedFirstStepWithEntryFactor] [float] NULL,

[weekCorrection] [int] NULL,

[StepWCorr] [int] NULL,

[M2Correction] [int] NULL,

[StepM2Corr] [int] NULL,

[fixedDeliveryWeek] [int] NULL,

[createdBy] [nvarchar](255) NULL,

[dateCreated] [nvarchar](255) NULL,

[trialevaluation] [nvarchar](255) NULL,

CONSTRAINT [aaaaaTrial_PK] PRIMARY KEY NONCLUSTERED

Eirikur Eiriksson SSC Guru Points: 182948 More actions · Answer 10

Quick analysis

First a simplified ERD

+--------------+ +--------------+ +-----------------+

+--------------+ +--------------+ +-----------------+

| TrialID | ,---|-| CultureID |-|--, | CultureStepID |

| TrialCode | | | CultureName | '---|<| CultureID |

| CultureID |>|--' | (Culture ) | | ( CultureStep ) |

| (attributes) | +--------------+ +-----------------+

+--------------+

Looking at the ERD, we can tell that:

Each Trial has one and only one Culture.

One or more Trials can share the same Culture.

And

Each Culture has one or more CultureSteps.

One or more Cultures can share one or more CultureSteps.

Inspecting the Cardinality

Table | Row Count

------------|-----------

Trial | 6793

Culture | 2327

CultureStep | 13957

As the query "Select * from Culture inner join Trial on Culture.CultureID = Trial.CultureID" returns 6785 rows, we know that there are 8 (6793 - 6785) Trials sharing the same Culture. We can also tell that each Culture has the average of 6 CultureSteps (13957/2327 = 5.998).

Since we do not know the distribution of CultureID in the CultureStep table, the exact numbers cannot be produced but here is an approxymation:

Select * from (Culture inner join Trial on Culture.CultureID = Trial.CultureID) inner join CultureStep on Culture.CultureID=CultureStep.CultureID

Given that each CultureID appears aprox. 6 times in the CultureStep and the set Culture-Trial has 6785 entries, the expected result should have close to 6 x 6785 entries.

shaimaa.tarekelshoeiby SSC Veteran Points: 212 More actions · Answer 11

Thank you for the excellent post .

Can you please help me become capable of doing such analysis?

How were you able to make the simplified ERD and come up with the conclusions under it

I want to learn this and be good at it ...

Also , need to be good with joins and although I did read out several links with examples ...I still seem to find issues like when my brain cannot digest how an inner join would get more rows than the individual selects and so on.

Eirikur Eiriksson SSC Guru Points: 182948 More actions · Answer 12

shaimaa.tarekelshoeiby (8/25/2014)
Thank you for the excellent post .
Can you please help me become capable of doing such analysis?
How were you able to make the simplified ERD and come up with the conclusions under it
I want to learn this and be good at it ...
Also , need to be good with joins and although I did read out several links with examples ...I still seem to find issues like when my brain cannot digest how an inner join would get more rows than the individual selects and so on.

You are very welcome.

My suggestion is to grab a good book on the subject, i.e. "Database Modeling and Design" or "Data Modeling for the Business" and while reading through it, use only simple tools such as pen/paper or a text editor to work through exercises and examples. Also read up on the "Relational Database Theory", "Naive Set Theory" and "Relational Algebra". This should get you well on the way.

Another thing I find helpful, especially when tackling very complex problems, is to practice solving puzzles like Soduku by memory (not using pen and paper), keeps the grey matter functional;-)

shaimaa.tarekelshoeiby SSC Veteran Points: 212 More actions · Answer 13

Thank you ...

But with so limited time , what can I do to help me go through until I can have more time to read the book(s) you suggested and use pen/paper?

Any recommended set of articles or a small daily exercise ; something of the sort?

Grant Fritchey SSC Guru Points: 399171 More actions · Answer 14

There's no instant or magic way to build this kind of knowledge. It takes a long time. I'd add getting a copy of the book T-SQL Querying Fundamentals by Itzik Ben Gan. It's a great way to learn.

"The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
- Theodore Roosevelt

Author of:
SQL Server Execution Plans
SQL Server Query Performance Tuning

SQL Server JOINS

Cookies on SQLServerCentral