Nested querries

  • I am having an embarassingly tough time with nesting querries. My insticts are wrong since I formed them using file based not table base database structures. I am trying to complete the exercises in Celko's Instant SQL Programming and having a great deal of difficulty with the chapter on joins. Do you have a book or something which would give me more practice--something with answers so I can tell if I am on the right track?

    Thanks!!

    PS Steve, I have ordered your book on 70-229. Do you have exercises in there?

    Sivea


    Sivea

  • There are lots of books, web sites and other instruction materials out there. But what are you having trouble with and we can see if we can help (but not answer). Point you in the right direction we will.

  • Don't worry about "doing my homework" for me. I am my own instructor (a bit like being you own lawyer, I think. How does that go, the one about "a fool for a lawyer?").

    I'm afraid I'm not sure how to describe in email my sticking points. I have succeed,this morning in working out one of the exercises I was having trouble with. I think. What I am looking for are more "homework" type exercises where I could compare my results with the "right" ones. I am working on simple nested querries requiring related information from 4 tables. Such as "Who ordered doodads and doohickeys on the same order?" It should be so simple but obviously I haven't grasped the fundamentals firmly enough. I DID manage to work this one out:

    select distinct c1.cname

    from Order as o1, OrderItems as i1, Orderitems as i2, Customers as c1

    where c1.custid = o1.custid

    and i1.orderid = i2.orderid

    and i1.partid = (select v1.partid

    from Inventory as v1

    where v1.description = 'doodad')

    and i2.partid = (select v1.partid

    from Inventory as v1

    where v1.description = 'doohickey')

    order by cname;

    I think I'm finally getting it but would prefer to be MUCH more in command of the syntax and what it will do (and won't do).

    BTW, suggestions for better ways to do this exercise?

    Sivea


    Sivea

  • I see, what you are doing with subqueries is fine except you lose a lot of bennifits of indexes when you do it this way as it creates a derived table that has no index on it. What I would have done is INNER JOINs instead to boost the performance of the overall query. Ex.

    SELECT DISTINCT

    c1.cname

    FROM

    Order as o1

    INNER JOIN

    OrderItems as i1

    INNER JOIN

    Inventory as v1

    ON

    i1.partid = v1.partid

    ON

    o1.orderid = i1.orderid

    INNER JOIN

    Customers as c1

    ON

    c1.custid = o1.custid

    WHERE

    v1.description IN ('doodad','doohickey')

    ORDER BY

    cname;

    Explination

    First SELECT DISTINCT and list I think you understand

    Second I think it was an error when you listed OrderItems 2x as you only have to reference it once.

    The reason for using INNER JOIN is from what I could see all your items required the same item in common and has to exist in both tables.

    The nested INNER JOIN after OrderItems to Inventory was done to process that as a whole. I have found that even thou I could have put the INNER JOIN in like the others that by doing this when 1 table is common to another but not directly related to the other tables that table is in turn related too that overall perfomance is better doing this and you will see the execution plan follow what you want better by being specific. (In this case Inventory relates to OrderItems but not to Order or Customer).

    Also if you look I don't use a subquery to get an equal on partid since there is such a commonality. Now the WHERE clause I just have to put the in the overall condition that affects what items I want to see and you will not I did IN (LIST) here. I could have written it v1.description = 'doodad' OR v1.description = 'doohickey' but IN is a shorthanded version that means the same thing so less typing. Also I do suggest doing your queries in this format for readability as it will help you see what you are doing.

    Let me know if there is anything that I did not explain to you best understanding.

  • First of all, thanks for your help to such basic questions.

    Now, issues:

    I left out a line of code when I typed it--

    "and o1.orderid = i1.orderid"

    somewhat important . . .

    But you gave me the benefit of the doubt.

    I used orderitems twice to create a self join from which I could extract orders with both doodads and doohickeys on them. That's why I used

    "where v1.description = 'doodad')

    and i2.partid = (select v1.partid

    from Inventory as v1

    where v1.description = 'doohickey')"

    to avoid the (a,b), (b,a) and (a,a) rows (as Joe Celko puts it).

    Am I overstating it, then? I built the code by working with smaller parts and elaborating it until I included all the parameters. The only way I could do it since I don't "grok" it yet.

    I tried to run your sample code but the Watcom 4 engine choked on the "ON" statements. I can't export the data from the runtime sample database that came with the book although I have created all the tables in SQL Server 7. I lost heart at the 4 double-sided pages of "insert" commands to recreate the data. This was just supposed to be a small step along the way!

    INNER JOIN creates indexes and the other method doesn't? I didn't get that. Same with OUTER, LEFT and RIGHT JOINS?

    What did you mean when you said, "I don't use a subquery to get an equal on partid since there is such a commonality"? You used "ON i1.partid = v1.partid", isn't that the same effect as "WHERE i1.partid = v1.partid"?

    Sivea

    Edited by - siveakey on 02/08/2002 3:57:20 PM


    Sivea

  • JOINs can take advantage of pre-existing indexes. They don't create them. That's what Antares is getting at. So far as joins are concerned, think of things in these terms:

    INNER JOIN:

    Whatever condition I set (e.g. TableA.Column1 = TableB.Column1) must be met. I only return rows that meet that criteria.

    OUTER JOIN:

    I will return all rows that meet the criteria, but I can return more rows that that. Here's the break down:

    LEFT OUTER JOIN:

    TableA LEFT OUTER JOIN TableB

    TableA is on the left side of this join statement. Not only am I going to return all rows that match up between TableA and TableB, I'm also planning on returning all the rest of TableA's rows as well. Since I don't have a match on the TableB side, I'll just set all the columns to NULL since there isn't a known value there.

    RIGHT OUTER JOIN:

    TableA RIGHT OUTER JOIN TableB

    TableB is on the right side of this join statement. Not only am I going to return all rows that match up between TableA and TableB, I'm also planning on returning all the rest of TableB's rows as well. Again, I don't have matching values for the TableA side, but that's okay. I'll populate the TableB columns and where I don't have a match on TableA, I'll just put NULL for the TableA columns.

    FULL OUTER JOIN:

    Not only am I going to return all the rows that match, but I'm going to return all of TableA's rows that don't have a match AND all of TableB's rows that don't have a match. Same rules apply as with the other two OUTER JOIN clauses. In other words, if there isn't a match on the TableA side, TableA columns get NULL. If there isn't a match on TableB, the TableB columns are NULL.

    CROSS JOIN:

    I will take every row in TableA and match it up with every row in Table B. There isn't a join condition. This is like saying if I have 5 boys and 4 girls and each boy and girl are required to dance with each other, I should get 20 dances. In SQL terms thats:

    SELECT Boy.Name, Girl.Name

    FROM Boy CROSS JOIN Girl

    K. Brian Kelley

    bkelley@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/bkelley/

    Edited by - bkelley on 02/08/2002 4:10:19 PM

    K. Brian Kelley
    @kbriankelley

  • quote:


    INNER JOIN creates indexes and the other method doesn't? I didn't get that. Same with OUTER, LEFT and RIGHT JOINS?


    Actually INNER JOINS can take advantage of the indexes, what happens in the other method where you did a i2.partid = (SELECT ...) the items in the subquery generate a derived table without an index so there is no major INDEX now to help with the compare (SQL will sometimes order the table in memory and if you look at the output of and execution plan you will see it does try to create some effeciency but that is something that you get into later).

    quote:


    What did you mean when you said, "I don't use a subquery to get an equal on partid since there is such a commonality"? You used "ON i1.partid = v1.partid", isn't that the same effect as "WHERE i1.partid = v1.partid"?


    Now as to why you choked on ON. You are running a SQL Server right? (That's at least what I thought I understood). What was the error and what are you using to run your queries? And as to ON and WHERE being the same, yes you are correct, the only difference is that ON makes it easier to read and I believe that conforms to the SQL92 standard (forget exactly which one). It also I believe works better at optimizing the query as the parse can make faster determinations about what it has to do.

    quote:


    I used orderitems twice to create a self join from which I could extract orders with both doodads and doohickeys on them.


    Lastly, yes you are just over stating here and that is a real common mistake people make when learning. What you have really done is double the data across the row for that table.

    Ex.

    If this is the table:

    ORDERITEMS

    OrderID PartID

    1 4

    2 68

    3 241

    With doing the way you had before by creating 2 alias (when you use the syntax OBJECT as ALIAS) and joining it to itself you have an output of.

    JOINEDVERSIONTABLE

    OrderID PartID OrderID PartID

    1 4 1 4

    2 68 2 68

    3 241 3 241

    Sorry I did not line up the columns. Again this is a common mistake but I have seen a few situations where doing that applied (Usually only tables that have a field that ties back to another field such as in an Employee database where a manager is an employee but people also have their id in the manager field.)

  • "Point you in the right direction we will." Bravo! Best posting I've read today!

    Definitely use ON for your joins rather than the where - everyone else who reads your work will appreciate it!

    Andy

  • Thank you, thank you, I will be here all week.

  • Wow! Thanks so much to both of you. I will have to take some time to digest all this wonderful instruction!! Then, try the new skills on the next exercise.

    I will perhaps have to enter the data into SQL Server 7 to take full advantage of your instruction. I can see it will be VERY worthwhile!

    Sivea


    Sivea

  • Good luck, and enjoy. Let us know if you get stuck elsewhere or something puzzles you.

  • Alright, back for more. Since last posting I have inserted all the data from Joe Celko's example code into the tables I create in SQL Server 7 (opened Access for the first time--shocking, I know--and created linked tables with forms--ooo00). So I could try the code you suggested, Antares. (Apparently the Watcom SQL runtime engine couldn't handle your code--nothing personal, I'm sure. ha ha)

    Thanks to your and Brian's tutelage, I understand the code just well enough to see what is right but not well enough to solve what is wrong. The data returned includes all orders which have doodads OR doohickeys but we need orders with doodads AND doohickeys on the same order. That is orginally why I didn't use the IN (LIST) statement and used the subqueries in my WHERE statement and a SELF JOIN to give me 2 unique partids which were from the same orderid. I was under the illusion that the INNER JOINS somehow made this unnecessary. Now, I am not so sure . . .

    Sivea


    Sivea

  • Ok, let's step through this one join at a time. For table aliasing I tend to use the first letter of each word for the table, so you'll see the following aliases:

    Customer: C

    Order: O

    OrderItems: OI1, OI2

    Inventory: I1, I2

    First, we know the connection between tables is as follows:

    Customers

    |

    Order

    |

    OrderItems

    |

    Inventory

    That's how we get from customer ultimately to doodad or doohickey in the inventory description. So let's start at the first step, the join between Order and Customers. We only need records that match on both tables, so we'll use an INNER JOIN:

    
    
    SELECT DISTINCT C.CName
    FROM Customers C
    JOIN [Order] O ON C.CustID = O.CustID

    Now, since we're looking for two items to be in the order, we're actually going to have to do a self-join like you said. Ultimately, though, we also need to join this up to our original query. Let's start there. Since we're going to join OrderItems twice, I'll call the first table OI1:

    
    
    SELECT DISTINCT C.CName
    FROM Customers C
    JOIN [Order] O ON C.CustID = O.CustID
    JOIN OrderItems OI1 ON O.OrderID = OI1.OrderID

    I've made the assumption that OrderItems has an OrderID column which can allow the join with Order. We'll use it again for a self join:

    
    
    SELECT DISTINCT C.CName
    FROM Customers C
    JOIN [Order] O ON C.CustID = O.CustID
    JOIN OrderItems OI1 ON O.OrderID = OI1.OrderID
    JOIN OrderItems OI2 ON OI1.OrderID = OI2.OrderID

    The self join is in place, but the WHERE clause is based on Inventory description, so we'll need to join Inventory to each of the OrderItems tables. Let's take it one at a time, ensuring we go ahead and put in the where clause. I'm going to look for 'doodad' in the first Inventory table (more on this in a minute):

    
    
    SELECT DISTINCT C.CName
    FROM Customers C
    JOIN [Order] O ON C.CustID = O.CustID
    JOIN OrderItems OI1 ON O.OrderID = OI1.OrderID
    JOIN OrderItems OI2 ON OI1.OrderID = OI2.OrderID
    JOIN Inventory I1 ON OI1.PartID = I1.PartID
    WHERE I1.Description = 'doodad'

    When I join Inventory a second time, but to the second OrderItems table, I'll do a check for doohickey. The first OrderItems - Inventory join takes care of doodad. The second OrderItems - Inventory join will take care of doohickey. Since I'm continuing with INNER JOINs, the only way I get a record back is if a particular Order has an OrderItem record with both a doodad and a doohickey line item. Let's do the second join and WHERE clause condition (which means we'll use an AND so that we ensure we meet both conditions):

    
    
    SELECT DISTINCT C.CName
    FROM Customers C
    JOIN [Order] O ON C.CustID = O.CustID
    JOIN OrderItems OI1 ON O.OrderID = OI1.OrderID
    JOIN OrderItems OI2 ON OI1.OrderID = OI2.OrderID
    JOIN Inventory I1 ON OI1.PartID = I1.PartID
    JOIN Inventory I2 ON OI2.PartID = I2.PartID
    WHERE I1.Description = 'doodad'
    AND I2.Description = 'doohickey'

    A similar example from the Northwind database is as follows:

    
    
    SELECT DISTINCT C.CustomerID
    FROM Customers C
    JOIN Orders O ON C.CustomerID = O.CustomerID
    JOIN [Order Details] OD1 ON O.OrderID = OD1.OrderID
    JOIN [Order Details] OD2 ON OD1.OrderID = OD2.OrderID
    JOIN Products P1 ON OD1.ProductID = P1.ProductID
    JOIN Products P2 ON OD2.ProductID = P2.ProductID
    WHERE P1.ProductName = 'Ikura'
    AND P2.ProductName = 'Boston Crab Meat'

    It's late where I am so hopefully all this makes sense and I didn't commit a typo or a simple logic error. BTW, if you take a look at the estimated execution plan on the Northwind example, it sticks with indexes.

    K. Brian Kelley

    bkelley@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/bkelley/

    K. Brian Kelley
    @kbriankelley

  • WOW, BEAUTIFUL!!! You must be on the East coast, Brian. I'm in Mountain time and I thought it was late where I am! Thank you, thank you, thank you both for all the help.

    I went to sleep trying to figure out a way to use INNER JOINs to create a similar result but kept falling back on the SELF JOIN structure from Celko's book. I knew that wasn't right. When I woke up this morning I booted everything up to try to work out the answer and there was your posting. Beleive it or not, this stuff makes sense. This is the sort of logic I am used to using in FMP, just didn't know the SQL terms for it. EXCELLENT!!

    I have to go put on a baby shower for my sister but you can bet I'll be back at this great stuff as soon as I can.

    BTW, I have to take SQL as an "independent study" (but without an instructor who can actually help me) because the community college where I'm pursuing my certifications doesn't offer SQL and the local university doesn't offer certification training. Bummer. [Oh, and no one is even looking at the stuff I'm doing with Celko. My grades are based on Transcender scores.] I have a BFA from Smith College so I need a little something technical on my CV. Getting an ACIS while I'm getting my MCSE and MCDBA. Of course, business picked up for our FMP consulting just as I'm trying to get through my last semester and my last 3 MCSE exams. Not enough for me to quit school, though. I still think SQL is the right addition to our business offerings. It's just going to take me a while longer than I thought to get good at it! Maybe someone will take pity and hire me as an assistant programmer/admin.

    Sivea

    PS I'm also going to spend some time with BOL reading up on JOINs from THAT perspective and try some more exercises with my new found knowledge!

    Edited by - siveakey on 02/09/2002 08:46:46 AM

    Edited by - siveakey on 02/09/2002 08:54:51 AM


    Sivea

  • One other place you might want to look is at an article by Neil Boyle. It covers the topic of JOINs and might also be helpful:

    http://www.sqlservercentral.com/columnists/nboyle/ansijoins.asp

    K. Brian Kelley

    bkelley@sqlservercentral.com

    http://www.sqlservercentral.com/columnists/bkelley/

    K. Brian Kelley
    @kbriankelley

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply