Query tuning - Left Join vs. Inner Join vs. main table select

  • I have a main record table, joined in one-to-many relationships with around 20 aux tables. When filtering the main table, I may use conditions from fields containing text data, as well as from the linked tables via the foreign key. I've been playing with various methods of assembling the queries, and come up with several alternatives, but I'm sort of stuck on which (if any) are better or worse. The text fields are obvious - no choice there, but there are several ways to word the conditions regarding the aux tables. I tried three here: selecting on the foreign key field in the main table, selecting on the primary key in the aux table joined via a left join (there may be other conditions that will evaluate to TRUE, in which case I want the record, even though it didn't pass this test) and selecting on an embedded select clause joined with an inner join.

    #1. SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    LEFT JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID

    WHERE P.GenusAutoID = 35423

    #2. SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    LEFT JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID

    WHERE G.GenusAutoID = 35423

    #3.SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    INNER JOIN

    (SELECT GenusAutoID, Genus FROM dbo.vwTableOfGenuss WHERE GenusAutoID = 35423) G

    ON P.GenusAutoID = G.GenusAutoID

    The three execution plans are attached as well - I hope I did that right. These queries are extremely simplified, for test purposes. The actual queries return more fields and join more tables.

    Versions 2 and 3 appear identical, so it seems that the optimizer is parsing them down to the same final result. Plan 1 looks simpler in the graphic view, but of course that means nothing from an execution standpoint. Here the differences are trivial anyway, since the condition is simple - the response time is instant. But what happens when I join more tables, and add text fields?

    Is there likely to come a point when the specified subquery and inner join of #3 make a difference?

    If I have more fields, and text fields, will the optimizer decide that it's easier to do everything from the main table, rather than first gathering many aux tables?

    Is the optimizer even capable of making such a decision, or will it obediently test exactly the fields I specify, even if testing the other side of the linking key might be better?

    It seems to me that maybe it is that smart:

    - In the first query, I get the same results whether I specify a left or inner join, although the query plans are different.

    - In the second query, both the execution plan and results are the same for left or inner. In fact, it looks like the same plan as for Query #1 using the inner join.

    - In the third query, the results are wrong (all records) when I use left, correct when I use inner. (Obviously, the plans are different.) The correct plan again looks the same as the inner join plan for Query #1 and both versions of Query #2.

    Is this even worth investigating? Theoretically, many conditions are possible, but in practice, I would rarely expect more than two or three. If this sort of nit-picking generates benefits only at some fairly high level of complexity, I can use my time better elsewhere.

  • I can't see a difference between queries 1 and 2... Did you paste the right one in

    You should also note that because you have a filter on the right-hand table of a left join that filters out nulls, you've essentially converted the join to an inner join. Your query is actually this:

    SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    INNER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID

    WHERE G.GenusAutoID = 35423

    That's why 3 is identical in plan and meaning.

    If you want a left join, the query needs to be

    SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    LEFT JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID

    WHERE G.GenusAutoID = 35423 OR G.GenusAutoID IS NULL

    In which case query 2 would be logically different queries from 3 and return different results.

    Simple rewrites like this are not going to result in a different plan (unless your syntax change results in a different query) as by the time the optimiser sees it, the query's been converted into a logical tree.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/4/2012)


    I can't see a difference between queries 1 and 2... Did you paste the right one in

    It looks like it is in the WHERE condition and the alias is G for one and P for the other. To the OP, this will make no difference as the join has already occurred (I believe).

    Jared
    CE - Microsoft

  • Ok, now I see the difference...

    That's changed the logical meaning of the query. The first query is a left join, the filter is done on the let-hand table of a left join. The second query, as I posted, is actually an inner join because of the filter on the right-hand table of a left join that eliminates nulls. So, the three queries you posted are actually this logically:

    1)

    SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    LEFT OUTER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID

    WHERE P.GenusAutoID = 35423

    2

    SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    INNER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID

    WHERE G.GenusAutoID = 35423

    3

    SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    INNER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID

    WHERE G.GenusAutoID = 35423

    If you took your third query and turned it to a left join, it would simplify to this

    SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    LEFT OUTER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID AND G.GenusAutoID = 35423

    Which is logically different from all of the others.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/4/2012)


    Ok, now I see the difference...

    Sorry, I ran out for some food.

    That's changed the logical meaning of the query. The first query is a left join, the filter is done on the let-hand table of a left join. The second query, as I posted, is actually an inner join because of the filter on the right-hand table of a left join that eliminates nulls.

    So SQL Server will actually use the join it thinks is best, in spite of what I specify? Interesting, I didn't know the optimizer went that far with it.

    If you took your third query and turned it to a left join, it would simplify to this

    SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    LEFT OUTER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID AND G.GenusAutoID = 35423

    Which is logically different from all of the others.

    Would that make more sense, to do it that way? Again, I'm joining around twenty tables, so the syntax gets a little complicated, especially since I'm assembling the query in code. If I can use a left join everywhere, instead of having to code sometimes an inner, sometimes a left, it would make things a bit simpler.

    And I'm a little confused here:

    LEFT OUTER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID AND G.GenusAutoID = 35423

    A join should mention both tables in each clause, shouldn't it? What is 'G.GenusAutoID = 35423' doing in the ON clause?

  • For instance, there's no reason to use a JOIN if you use fields from one table. Instead of doing this:

    SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    LEFT JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID

    WHERE P.GenusAutoID = 35423

    You could simply do this:

    SELECT P.PodrobnostiAutoID

    FROM dbo.vwPodrobnosti P

    WHERE P.GenusAutoID = 35423

    Unless there's something else missing in the requirements that you haven't shared with us.

    So SQL Server will actually use the join it thinks is best, in spite of what I specify? Interesting, I didn't know the optimizer went that far with it.

    SQL Server won't change the join type, but if you include a column from a table joined with an OUTER JOIN in the WHERE clause, it will work as an INNER JOIN.

    A join should mention both tables in each clause, shouldn't it? What is 'G.GenusAutoID = 35423' doing in the ON clause?

    It's not necessary to mention both tables in each clause, just the table you're joining. This is useful on outer joins.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • pdanes (6/4/2012)


    GilaMonster (6/4/2012)


    Ok, now I see the difference...

    Sorry, I ran out for some food.

    That's changed the logical meaning of the query. The first query is a left join, the filter is done on the let-hand table of a left join. The second query, as I posted, is actually an inner join because of the filter on the right-hand table of a left join that eliminates nulls.

    So SQL Server will actually use the join it thinks is best, in spite of what I specify? Interesting, I didn't know the optimizer went that far with it.

    No, your query form (the predicate in the where clause) changed the join to an inner join, so SQL did exactly what you told it.

    Again, I'm joining around twenty tables, so the syntax gets a little complicated, especially since I'm assembling the query in code. If I can use a left join everywhere, instead of having to code sometimes an inner, sometimes a left, it would make things a bit simpler.

    You use whatever join gives you the results you need. If you need only rows that match to the second table, use inner join. If you also need the rows that don't have a match, you use left join. It's about the logic of the query.

    And I'm a little confused here:

    LEFT OUTER JOIN dbo.vwTableOfGenuss G ON P.GenusAutoID = G.GenusAutoID AND G.GenusAutoID = 35423

    A join should mention both tables in each clause, shouldn't it? What is 'G.GenusAutoID = 35423' doing in the ON clause?

    No, a join can have any predicate in it. It's doing exactly what the filter in the subquery did when you had left join, filtering before joining. Makes no difference with an inner join, changes the query's logic with a left join.

    http://www.sqlmag.com/blog/sql-server-questions-answered-28/sql-server/determining-the-position-of-search-arguments-within-a-join-137108

    None of this is about performance. It's just about the correct query to get the data that you want.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Luis Cazares (6/4/2012)


    Unless there's something else missing in the requirements that you haven't shared with us.

    Yes, there is. As I mentioned in the original post, these queries are extremely simplified for the purposes of this test. I do actually use other fields from these joined tables, and in fact, have displays that concatenate some of these fields and display the results, whether an individual record does or does not have an entry from a particular table. Some of it is taxonomy, for example, and I must display the entire taxonomic tree (Class, Order, Family, Genus...), even though some entries may not be present in a particular record. This why I use Left Joins, because I need the record regardless, and do the filtering on which records I actually retrieve elsewhere. Otherwise, you're right, it would be simpler to just filter on the main table and not join at all. Unfortunately, I can't do that.

    So SQL Server will actually use the join it thinks is best, in spite of what I specify? Interesting, I didn't know the optimizer went that far with it.

    SQL Server won't change the join type, but if you include a column from a table joined with an OUTER JOIN in the WHERE clause, it will work as an INNER JOIN.

    I guess that makes sense - if I have a condition on the table I'm joining, I'm restricting the possible joined members, which is much the same as actually using an Inner Join. It's sometimes tough to wrap my skull around what's actually being specified at each stage in the query.

    A join should mention both tables in each clause, shouldn't it? What is 'G.GenusAutoID = 35423' doing in the ON clause?

    It's not necessary to mention both tables in each clause, just the table you're joining. This is useful on outer joins.

    Well, in that case I still don't understand what the 'G.GenusAutoID = 35423' phrase is doing here. When you say join ON something, you have to say what in each table is to be the link. Here it's a simple condition, testing against a literal. What role does it play in this join clause?

  • pdanes (6/4/2012)


    When you say join ON something, you have to say what in each table is to be the link.

    No you don't.

    This is perfectly legal and valid.

    SELECT * FROM t1 INNER JOIN t2 ON 1 = 0

    It'll never return data, but it's perfectly legal. See the link I posted.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/4/2012)


    See the link I posted.

    Sorry, Gail, my display didn't refresh before I posted that reply to Luis. Looking at that now...

  • pdanes (6/4/2012)


    Well, I've advanced some, at least to a higher state of confusion. It certainly appears that the subquery syntax (#3) was unnecessarily complicated and brought no improvements - the parser simply irons it out to be the same as when it's done in a more straighforward fashion. But I'm still a bit fuddled on the merits or lack thereof of the various WHERE clauses. Possibly I'm not explaining myself clearly enough - it wouldn't be the first time.

    I actually have two 'conditions' of sorts - one is the join, the other is the filter. Here is the actual SELECT statement in al its glory:SELECTdbo.fnsFormatAkcesit(A.AkcesitPred, A.Akcesit, A.Rok) AS Akc,

    dbo.fnsFormatDrEvid(P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra) AS DrEvid,

    dbo.fnsTaxonomieFormatted(tGe.Genus, tSGe.SubGenus, tSp.Species, tSSp.SubSpecies, P.Authors, tF.Family,

    tOr.[Order], tGr.[Group]) AS Taxonomie, dbo.fnsStratigrafie(tSy.System, tSe.Series,

    tSt.Stage, tSSt.SubStage, tLU.LithographicUnit, tLSU.LithographicSubUnit,

    tZ.Zone) AS Stratigrafie, tC.Cesky + N' - ' + tC.Country AS Kountry, CASE WHEN Depozitar IS NULL

    THEN '' ELSE Depozitar END + CASE WHEN UlozisteDocasne IS NULL THEN '' ELSE ' {' + UlozisteDocasne + '}' END AS Lokace, P.PodrobnostiAutoID,

    P.AkcesAutoID, P.EvidenceGroup, P.GroupAutoID, P.OrderAutoID, P.FamilyAutoID,

    P.GenusAutoID, P.SubGenusAutoID, P.SpeciesAutoID, P.SubSpeciesAutoID,

    P.SystemAutoID, P.SeriesAutoID, P.StageAutoID, P.SubStageAutoID,

    P.LithographicUnitAutoID, P.LithographicSubUnitAutoID, P.ZoneAutoID, P.CountryAutoID,

    P.DepozitarAutoID, P.UlozisteDocasne, P.PDAutoID, P.ODAutoID, P.OriginAutoID,

    P.PocetKusu, P.OTHER_NO, P.Poznamka, P.Description, P.RockType,

    P.Preserv, P.Original, P.Authors, P.Lokalita, P.IDNeDruheEvidence,

    P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra, A.Rok, A.AkcesitPred, A.Akcesit,

    A.KusuVKatalogu, P.NepublikovatYN, tT.Zkratka, P.TypAutoID, left(CONVERT(varchar(100), tNI.Datum, 21),10) As Datum, tNI.Clovek

    FROM dbo.Podrobnosti P INNER JOIN

    dbo.Akces A ON P.AkcesAutoID = A.AkcesAutoID LEFT OUTER JOIN

    dbo.vwNejnovejsiInventarizace tNI ON P.PodrobnostiAutoID = tNI.PodrobnostiAutoID LEFT OUTER JOIN

    dbo.TableOfTyps tT ON P.TypAutoID = tT.TypAutoID LEFT OUTER JOIN

    dbo.TableOfSystems tSy ON P.SystemAutoID = tSy.SystemAutoID LEFT OUTER JOIN

    dbo.vwTableOfDepozitarsFormatted tDF ON P.DepozitarAutoID = tDF.DepozitarAutoID LEFT OUTER JOIN

    dbo.TableOfLithographicSubUnits tLSU ON P.LithographicSubUnitAutoID = tLSU.LithographicSubUnitAutoID LEFT OUTER JOIN

    dbo.TableOfCountrys tC ON P.CountryAutoID = tC.CountryAutoID LEFT OUTER JOIN

    dbo.TableOfLithographicUnits tLU ON P.LithographicUnitAutoID = tLU.LithographicUnitAutoID LEFT OUTER JOIN

    dbo.TableOfSeriess tSe ON P.SeriesAutoID = tSe.SeriesAutoID LEFT OUTER JOIN

    dbo.TableOfStages tSt ON P.StageAutoID = tSt.StageAutoID LEFT OUTER JOIN

    dbo.TableOfZones tZ ON P.ZoneAutoID = tZ.ZoneAutoID LEFT OUTER JOIN

    dbo.TableOfSubStages tSSt ON P.SubStageAutoID = tSSt.SubStageAutoID LEFT OUTER JOIN

    dbo.TableOfSubGenuss tSGe ON P.SubGenusAutoID = tSGe.SubGenusAutoID LEFT OUTER JOIN

    dbo.TableOfSubSpeciess tSSp ON P.SubSpeciesAutoID = tSSp.SubSpeciesAutoID LEFT OUTER JOIN

    dbo.TableOfSpeciess tSp ON P.SpeciesAutoID = tSp.SpeciesAutoID LEFT OUTER JOIN

    dbo.TableOfOrders tOr ON P.OrderAutoID = tOr.OrderAutoID LEFT OUTER JOIN

    dbo.TableOfGenuss tGe ON P.GenusAutoID = tGe.GenusAutoID LEFT OUTER JOIN

    dbo.TableOfGroups tGr ON P.GroupAutoID = tGr.GroupAutoID LEFT OUTER JOIN

    dbo.TableOfFamilys tF ON P.FamilyAutoID = tF.FamilyAutoID

    WHEREP.GenusAutoID = 11111

    ANDP.SpeciesAutoID = 22222The joins ALWAYS lists all the tables, because I need to display fields from these tables for whichever record I end up retrieving. Not always will every table have a corresponding entry in the main table; that's why I'm using a left join. I was experimenting with various ways to phrase those, trying to see if I could improve on the situation, but that seems to be pointless. However, the WHERE clause will vary quite a lot. And here is where I'm not clear - whether I should use 'P.GenusAutoID = 11111' or 'dbo.TableOfGenuss.GenusAutoID = 11111' in the WHERE clause. The query plans vary quite a lot, sometimes even when I simply change the value of the literal, which I don't understand at all why that would happen.

    But I can't figure out from the query plans whether one way is provably better than the other - I simply don't understand what I'm looking at well enough, in spite of having read many articles on analyzing query plans. On one hand, it seems that limiting the aux tables might result in less data getting shoveled around unnecessarily. On the other hand, if all the comparisons are made against one (the main) table, ESPECIALLY when I include text fields from the main table which cannot be read from anywhere else, it seems that limiting the scan to one table might be faster than looking at many tables.

    Is one way better than the other? A little? A lot? Not at all? Or is SQL Server by any chance smart enough to understand the relationship between the tables and do things its own way, regardless of how I code it? That would be ideal, and I wouldn't even be surprised if that information is in the query plan, for those with the wit to comprehend it. I'll create a few test examples and post the actual plans, if it would help, but they're pretty messy.

  • It has nothing whatsoever to do with performance. The meaning of the two queries is different. That's why you're getting different execution plans.

    Simplified example:

    CREATE TABLE #t1 (

    ColourID INT IDENTITY PRIMARY KEY,

    Colour varchar(20)

    )

    CREATE TABLE #t2 (

    FruitID INT IDENTITY PRIMARY KEY,

    Fruit varchar(20),

    ColourID int

    )

    INSERT INTO #t1 (Colour)

    VALUES ('Red'), ('Green'), ('Blue'), ('Yellow')

    INSERT INTO #t2 (Fruit, ColourID)

    VALUES ('Strawberry', 1), ('Blueberry',3), ('Banana',4)

    SELECT * FROM #t1 LEFT OUTER JOIN #t2 ON #t1.ColourID = #t2.ColourID

    WHERE #t1.ColourID = 2

    SELECT * FROM #t1 LEFT OUTER JOIN #t2 ON #t1.ColourID = #t2.ColourID

    WHERE #t2.ColourID = 2

    DROP TABLE #t1

    DROP TABLE #t2

    Use the one that produces the correct output.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/5/2012)


    It has nothing whatsoever to do with performance. The meaning of the two queries is different. That's why you're getting different execution plans.

    But Gail, they both produce EXACTLY the same results - that's why I'm writing this. If one produced incorrect results, I wouldn't be pestering the forum with the question at all. Different queries, different execution plans, identical results (both correct), in the same order, even. Here are the two queries, with the plans attached. the first has the WHERE condition directed to the main table alias (P), the second to the aux table aliases (tGe and tSp).

    SELECTdbo.fnsFormatAkcesit(A.AkcesitPred, A.Akcesit, A.Rok) AS Akc,

    dbo.fnsFormatDrEvid(P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra) AS DrEvid,

    dbo.fnsTaxonomieFormatted(tGe.Genus, tSGe.SubGenus, tSp.Species, tSSp.SubSpecies, P.Authors, tF.Family,

    tOr.[Order], tGr.[Group]) AS Taxonomie, dbo.fnsStratigrafie(tSy.System, tSe.Series,

    tSt.Stage, tSSt.SubStage, tLU.LithographicUnit, tLSU.LithographicSubUnit,

    tZ.Zone) AS Stratigrafie, CASE WHEN Depozitar IS NULL

    THEN '' ELSE Depozitar END + CASE WHEN UlozisteDocasne IS NULL THEN '' ELSE ' {' + UlozisteDocasne + '}' END AS Lokace, P.PodrobnostiAutoID,

    P.AkcesAutoID, P.EvidenceGroup, P.GroupAutoID, P.OrderAutoID, P.FamilyAutoID,

    P.GenusAutoID, P.SubGenusAutoID, P.SpeciesAutoID, P.SubSpeciesAutoID,

    P.SystemAutoID, P.SeriesAutoID, P.StageAutoID, P.SubStageAutoID,

    P.LithographicUnitAutoID, P.LithographicSubUnitAutoID, P.ZoneAutoID, P.CountryAutoID,

    P.DepozitarAutoID, P.UlozisteDocasne, P.PDAutoID, P.ODAutoID, P.OriginAutoID,

    P.PocetKusu, P.OTHER_NO, P.Poznamka, P.Description, P.RockType,

    P.Preserv, P.Original, P.Authors, P.Lokalita, P.IDNeDruheEvidence,

    P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra, A.Rok, A.AkcesitPred, A.Akcesit,

    A.KusuVKatalogu, P.NepublikovatYN, tT.Zkratka, P.TypAutoID, left(CONVERT(varchar(100), tNI.Datum, 21),10) As Datum, tNI.Clovek

    FROM dbo.Podrobnosti P INNER JOIN

    dbo.Akces A ON P.AkcesAutoID = A.AkcesAutoID LEFT OUTER JOIN

    dbo.vwNejnovejsiInventarizace tNI ON P.PodrobnostiAutoID = tNI.PodrobnostiAutoID LEFT OUTER JOIN

    dbo.TableOfTyps tT ON P.TypAutoID = tT.TypAutoID LEFT OUTER JOIN

    dbo.TableOfSystems tSy ON P.SystemAutoID = tSy.SystemAutoID LEFT OUTER JOIN

    dbo.vwTableOfDepozitarsFormatted tDF ON P.DepozitarAutoID = tDF.DepozitarAutoID LEFT OUTER JOIN

    dbo.TableOfLithographicSubUnits tLSU ON P.LithographicSubUnitAutoID = tLSU.LithographicSubUnitAutoID LEFT OUTER JOIN

    dbo.TableOfCountrys tC ON P.CountryAutoID = tC.CountryAutoID LEFT OUTER JOIN

    dbo.TableOfLithographicUnits tLU ON P.LithographicUnitAutoID = tLU.LithographicUnitAutoID LEFT OUTER JOIN

    dbo.TableOfSeriess tSe ON P.SeriesAutoID = tSe.SeriesAutoID LEFT OUTER JOIN

    dbo.TableOfStages tSt ON P.StageAutoID = tSt.StageAutoID LEFT OUTER JOIN

    dbo.TableOfZones tZ ON P.ZoneAutoID = tZ.ZoneAutoID LEFT OUTER JOIN

    dbo.TableOfSubStages tSSt ON P.SubStageAutoID = tSSt.SubStageAutoID LEFT OUTER JOIN

    dbo.TableOfSubGenuss tSGe ON P.SubGenusAutoID = tSGe.SubGenusAutoID LEFT OUTER JOIN

    dbo.TableOfSubSpeciess tSSp ON P.SubSpeciesAutoID = tSSp.SubSpeciesAutoID LEFT OUTER JOIN

    dbo.TableOfSpeciess tSp ON P.SpeciesAutoID = tSp.SpeciesAutoID LEFT OUTER JOIN

    dbo.TableOfOrders tOr ON P.OrderAutoID = tOr.OrderAutoID LEFT OUTER JOIN

    dbo.TableOfGenuss tGe ON P.GenusAutoID = tGe.GenusAutoID LEFT OUTER JOIN

    dbo.TableOfGroups tGr ON P.GroupAutoID = tGr.GroupAutoID LEFT OUTER JOIN

    dbo.TableOfFamilys tF ON P.FamilyAutoID = tF.FamilyAutoID

    WHEREP.GenusAutoID = 37263

    ANDP.SpeciesAutoID = 51455

    SELECTdbo.fnsFormatAkcesit(A.AkcesitPred, A.Akcesit, A.Rok) AS Akc,

    dbo.fnsFormatDrEvid(P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra) AS DrEvid,

    dbo.fnsTaxonomieFormatted(tGe.Genus, tSGe.SubGenus, tSp.Species, tSSp.SubSpecies, P.Authors, tF.Family,

    tOr.[Order], tGr.[Group]) AS Taxonomie, dbo.fnsStratigrafie(tSy.System, tSe.Series,

    tSt.Stage, tSSt.SubStage, tLU.LithographicUnit, tLSU.LithographicSubUnit,

    tZ.Zone) AS Stratigrafie, CASE WHEN Depozitar IS NULL

    THEN '' ELSE Depozitar END + CASE WHEN UlozisteDocasne IS NULL THEN '' ELSE ' {' + UlozisteDocasne + '}' END AS Lokace, P.PodrobnostiAutoID,

    P.AkcesAutoID, P.EvidenceGroup, P.GroupAutoID, P.OrderAutoID, P.FamilyAutoID,

    P.GenusAutoID, P.SubGenusAutoID, P.SpeciesAutoID, P.SubSpeciesAutoID,

    P.SystemAutoID, P.SeriesAutoID, P.StageAutoID, P.SubStageAutoID,

    P.LithographicUnitAutoID, P.LithographicSubUnitAutoID, P.ZoneAutoID, P.CountryAutoID,

    P.DepozitarAutoID, P.UlozisteDocasne, P.PDAutoID, P.ODAutoID, P.OriginAutoID,

    P.PocetKusu, P.OTHER_NO, P.Poznamka, P.Description, P.RockType,

    P.Preserv, P.Original, P.Authors, P.Lokalita, P.IDNeDruheEvidence,

    P.EvidenceLetter, P.EvidenceNumber, P.EvidenceExtra, A.Rok, A.AkcesitPred, A.Akcesit,

    A.KusuVKatalogu, P.NepublikovatYN, tT.Zkratka, P.TypAutoID, left(CONVERT(varchar(100), tNI.Datum, 21),10) As Datum, tNI.Clovek

    FROM dbo.Podrobnosti P INNER JOIN

    dbo.Akces A ON P.AkcesAutoID = A.AkcesAutoID LEFT OUTER JOIN

    dbo.vwNejnovejsiInventarizace tNI ON P.PodrobnostiAutoID = tNI.PodrobnostiAutoID LEFT OUTER JOIN

    dbo.TableOfTyps tT ON P.TypAutoID = tT.TypAutoID LEFT OUTER JOIN

    dbo.TableOfSystems tSy ON P.SystemAutoID = tSy.SystemAutoID LEFT OUTER JOIN

    dbo.vwTableOfDepozitarsFormatted tDF ON P.DepozitarAutoID = tDF.DepozitarAutoID LEFT OUTER JOIN

    dbo.TableOfLithographicSubUnits tLSU ON P.LithographicSubUnitAutoID = tLSU.LithographicSubUnitAutoID LEFT OUTER JOIN

    dbo.TableOfCountrys tC ON P.CountryAutoID = tC.CountryAutoID LEFT OUTER JOIN

    dbo.TableOfLithographicUnits tLU ON P.LithographicUnitAutoID = tLU.LithographicUnitAutoID LEFT OUTER JOIN

    dbo.TableOfSeriess tSe ON P.SeriesAutoID = tSe.SeriesAutoID LEFT OUTER JOIN

    dbo.TableOfStages tSt ON P.StageAutoID = tSt.StageAutoID LEFT OUTER JOIN

    dbo.TableOfZones tZ ON P.ZoneAutoID = tZ.ZoneAutoID LEFT OUTER JOIN

    dbo.TableOfSubStages tSSt ON P.SubStageAutoID = tSSt.SubStageAutoID LEFT OUTER JOIN

    dbo.TableOfSubGenuss tSGe ON P.SubGenusAutoID = tSGe.SubGenusAutoID LEFT OUTER JOIN

    dbo.TableOfSubSpeciess tSSp ON P.SubSpeciesAutoID = tSSp.SubSpeciesAutoID LEFT OUTER JOIN

    dbo.TableOfSpeciess tSp ON P.SpeciesAutoID = tSp.SpeciesAutoID LEFT OUTER JOIN

    dbo.TableOfOrders tOr ON P.OrderAutoID = tOr.OrderAutoID LEFT OUTER JOIN

    dbo.TableOfGenuss tGe ON P.GenusAutoID = tGe.GenusAutoID LEFT OUTER JOIN

    dbo.TableOfGroups tGr ON P.GroupAutoID = tGr.GroupAutoID LEFT OUTER JOIN

    dbo.TableOfFamilys tF ON P.FamilyAutoID = tF.FamilyAutoID

    WHEREtGe.GenusAutoID = 37263

    ANDtSp.SpeciesAutoID = 51455

  • pdanes (6/5/2012)


    GilaMonster (6/5/2012)


    It has nothing whatsoever to do with performance. The meaning of the two queries is different. That's why you're getting different execution plans.

    But Gail, they both produce EXACTLY the same results - that's why I'm writing this.

    With your particular set of data at this particular point in time. That doesn't change the fact that they are different queries and absolutely can return different data. See the example I gave.

    If you filter on the right-hand table of a left join, then you have converted the join to an inner join (as I've said before). If that's what you need (inner joins), that's fine. If it's not, then you either need to filter the left-hand table of do a filter on the join (as in the example I posted yesterday) or filter WHERE <column> = <value> OR <column> is NULL if you need to preserve the left join characteristics.

    Again, this has nothing whatsoever to do with performance, you have two different queries there that are perfectly capable of returning different results. The first one is using left joins, the second, because of the filters, is actually going to do inner joins to the TableOfSubGenuss and TableOfSpeciess tables. The fact that they are returning the same data implies that all the rows in Podrobnosti have matching rows in those two tables, so you can just change the left join to inner join, then the two queries will be semantically and logically identical.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (6/5/2012)


    Hi Gail, sorry for the delay in responding, I had some fires to put out, and I wanted think about what I needed, and construct an example to help illustrate it.

    With your particular set of data at this particular point in time. That doesn't change the fact that they are different queries and absolutely can return different data. See the example I gave.

    I did study it and tried it out - naturally, it performed as you say and as I would expect. Your condition is for something that doesn't exist in the main (T2) table, 2, corresponding to 'Green'. When I change it to 3, both results are the same. Both these behaviors mirror what I get in my real data. However, here you're joining AND selecting on the same criteria. I want to ALWAYS join, and select based on user input.

    If you filter on the right-hand table of a left join, then you have converted the join to an inner join (as I've said before). If that's what you need (inner joins), that's fine.

    I need a left join when I do not specify a condition for that table, since I want any other data that may be in that table. If I do specify a condition, I want the equivalent of an inner join. That's how I initially coded that app that builds this query, with inner joins on tables where I specified a condition and left joins on tables that do not have a condition. That's how this whole thread got started. That variation of JOIN syntax turned out to be pointless, as you mention - it winds up being the equivalent of an INNER JOIN.

    Here is some sample code that may help illustrate what I'm doing:

    CREATE TABLE #m1 (FruitID INT IDENTITY PRIMARY KEY, Fruit varchar(20), ColorID int, ShapeID int, TasteID int, PriceID int, FreshID int)

    CREATE TABLE #a1 (ColorID INT IDENTITY PRIMARY KEY, Color varchar(20))

    CREATE TABLE #a2 (ShapeID INT IDENTITY PRIMARY KEY, Shape varchar(20))

    CREATE TABLE #a3 (TasteID INT IDENTITY PRIMARY KEY, Taste varchar(20))

    CREATE TABLE #a4 (PriceID INT IDENTITY PRIMARY KEY, Price varchar(20))

    CREATE TABLE #a5 (FreshID INT IDENTITY PRIMARY KEY, Fresh varchar(20))

    INSERT INTO #m1 (Fruit, ColorID, ShapeID, TasteID, PriceID, FreshID)

    VALUES

    ('Strawberry', 1, 1, 1, 1, 1),

    ('Cucumber', 2, 2, 2, 2, 2),

    ('Blueberry', 3, 3, 3, 3, 3),

    ('Banana', 4, 4, 4, 4, 4),

    ('Cayenne', 1, 2, 5, Null, 4),

    ('Tomato', 1, 3, 3, 4, 5),

    ('Broccoli', 2, Null, Null, 3, 1),

    ('Spinach', 2, Null, 1, 4, 3)

    INSERT INTO #a1(Color) VALUES ('Red'), ('Green'), ('Blue'), ('Yellow')

    INSERT INTO #a2(Shape) VALUES ('Heart'), ('Long'), ('Round'), ('Curved')

    INSERT INTO #a3(Taste) VALUES ('Sweet'), ('Sour'), ('Tangy'), ('Creamy'), ('Hot')

    INSERT INTO #a4(Price) VALUES ('High'), ('Low'), ('Medium'), ('Cheap')

    INSERT INTO #a5(Fresh) VALUES ('Yes'), ('Pickled'), ('Frozen'), ('Stale'), ('Canned')

    -- Example 1

    SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1

    LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID

    LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID

    LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID

    LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID

    LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID

    WHERE #m1.FreshID = 4

    SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1

    LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID

    LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID

    LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID

    LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID

    LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID

    WHERE #a5.FreshID = 4

    -- Example 2

    SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1

    LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID

    LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID

    LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID

    LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID

    LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID

    WHERE #m1.ColorID = 2

    SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1

    LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID

    LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID

    LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID

    LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID

    LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID

    WHERE #a1.ColorID = 2

    -- Example 3

    SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1

    LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID

    LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID

    LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID

    LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID

    LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID

    WHERE #m1.ShapeID = 2

    SELECT FruitID, Fruit, Color, Shape, Taste, Price, Fresh FROM #m1

    LEFT OUTER JOIN #a1 ON #m1.ColorID = #a1.ColorID

    LEFT OUTER JOIN #a2 ON #m1.ShapeID = #a2.ShapeID

    LEFT OUTER JOIN #a3 ON #m1.TasteID = #a3.TasteID

    LEFT OUTER JOIN #a4 ON #m1.PriceID = #a4.PriceID

    LEFT OUTER JOIN #a5 ON #m1.FreshID = #a5.FreshID

    WHERE #a2.ShapeID = 2

    DROP TABLE #m1

    DROP TABLE #a1

    DROP TABLE #a2

    DROP TABLE #a3

    DROP TABLE #a4

    DROP TABLE #a5

    See that I'm always joining ALL my aux tables, but filtering on only certain conditions. When I put in the two variations of the WHERE clause, filtering on either the main table or the appropriate aux table, I get identical results, but different query plans.

    Does this make it any clearer, or am I still completely off target?

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply