Avoiding a Cartesian Join - Full Outer Join On Same Table with different criteria and multiple rows either side

  • Hello

    This may have been answered before but I cannot find it.

    I am trying to create a query which consists of learners on courses.

    It needs to list the learner's name and then details of courses from last year in the first columns and then details of courses from the current year in the next columns.

    Most of the time the courses studied in each year will be different (so different course IDs).

    Something like Learner Name | Old Course | New Course

    The difficulty I am having is that the unique field is the learner's ID and they must have been enrolled on one or more courses last year and may be enrolled on none, one or more courses in the current year.

    I did a query which brought in the enrolments for last year and then a sub-select for the current year joined by the learner's ID.

    Obviously this causes a cartesian join so I get every combination of courses from last year and this year.

    The tables are basically

    Learner

    Enrolment

    Course

    Each table holds data for multiple years but has a yearID column.

    So say they were on A123 and A124 in 08/09 and B123 and B124 in 09/10 then I would get something like:

    08/09 A123 09/10 B123

    08/09 A123 09/10 B124

    08/09 A124 09/10 B123

    08/09 A124 09/10 B124

    When in fact what I want to achieve is:

    08/09 A123 09/10 B123

    08/09 A124 09/10 B124

    Obviously for people with lots of enrolments in both years this produces many extra rows.

    I'm sure there must be an easy way around this but I can't think how it could be done.

    Can anyone help?

    Thank you

    Robin

  • Sounds like homework to me...

    If so, you should show us what you've tried so far and where you get stuck.

    We might be able to give you a hint towards the right direction.

    But in order to do so you need to help us help you by providing table def, ready to use sample data, expected result and -like said before- your current approach.

    An excellent description on how to post questions can be found following the first link in my signature.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Hello lmu92

    It's not homework actually. I work as an Information Analyst in a college. I can see that it would make a good homework question though!

    I did not post the sql as I thought it might complicate things. The tables are very large and the enrolment table has over 100 columns (I am only using about 6 though in the query).

    I can not post actual data either so that is why I made up an example.

    I can post the sql if you want from work tomorrow.

    Below is a simplified example of the table structures (the real tables are actually called PEOPLE, LEARNER_AIMS and UNIT_INSTANCE_OCCURRENCES - I didn't choose those names either!)

    Thanks for any help or suggestions.

    Robin

    It is basically doing this (I have just made this up to demonstrate what I am trying to acheive - I can't run it as these tables don't actually exist):

    SELECT

    L.id,

    L.surname,

    L.forename,

    L.dob,

    C.course_code AS OLD_CRS_CODE,

    C.course_title AS OLD_CRS_TITLE,

    CUR_YR.course_code AS NEW_CRS_CODE,

    CUR_YR.course_title AS NEW_CRS_TITLE,

    CUR_YR.start_date,

    CUR_YR.exp_end_date,

    CUR_YR.act_end_Date,

    CUR_YR.completion,

    CUR_YR.outcome,

    CUR_YR.grade

    FROM learner L

    INNER JOIN enrolment E

    ON E.learner_id = L.id

    INNER JOIN course C

    ON C.id = L.course_id

    AND C.year = '08/09'

    FULL OUTER JOIN (

    SELECT

    E.learner_id AS LRN_ID,

    C.id AS CRS_ID,

    C.course_code,

    C.course_title,

    E.start_date,

    E.exp_end_date,

    E.act_end_Date,

    E.completion,

    E.outcome,

    E.grade

    FROM enrolment E

    INNER JOIN course C

    ON C.id = L.course_id

    AND C.year = '09/10'

    ) CUR_YR

    ON CUR_YR.lrn_id = L.id

    AND CUR_YR.crs_id = C.id

    ORDER BY

    L.surname,

    L.forename,

    L.id,

    L.course_code

  • Ok, you've done the first step already (create a query using fake table names).

    The next step would be to post the fake table DDL and the insert scripts for the fake data. Afte that you'd just have to add the expected output based on your sample data and we'd have something to test our solution against before posting...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • Hello Lutz

    Sorry I haven't managed to get time to do this yet but will try tomorrow.

    Robin

  • if you are trying to achieve this:

    Year, Course, Year, Course

    08/09 Math09/10 English

    08/09 Physics09/10 Math

    08/09 Chemistry

    I think you are trying to do a layout job in SQL ... that isn't right

    return the data to your presentation layer and layout the data there

  • Whoa! What's that full outer join doing there?

    If I'm understanding what you are trying to achieve correctly, you are trying to output:

    1. A column for the learner's name, then

    2. A column for the learner's previous courses for last year, and

    3. A column for the learner's current courses for this year.

    You only want to find people who have done a course last year, they may not be doing a course this year.

    I don't think this will fly! I can't see how you would get items 2 and 3 into the one query - the two just won't mesh.

    However, if you could setup the results to look something like this:

    ENROLLMENT_ID, YEAR, COURSE

    11111, 08/09, A123

    11111, 10/11, A124

    12345, 08/09, B567

    Then I think it could be doable. However, we sort of do need to know the DDL behind this... writing SQL for imaginary tables is really not something any of us want to do 🙂

    Random Technical Stuff[/url]

  • Sometimes the presentation layer is lacking. In ssrs2005, you'd have to jump through a lot of hoops to get this.

    I'm assuming that there is no relationship between a particular course last year, and a particular course this year. so in other words, it doesn't matter if abc100 shows up next to bca100 or ccc202

    Here's what you'll need to do. This won't be very efficient...

    Create a table set for the courses this year and one for the courses last year and add a row number to each one (partition by student)

    Do a full outer join between the two rowsets on the rownumber _and_ the studentID

    this will be easiest to read by using some CTEs (With clause) but you can do it just as well with inner selects like you've started to do.

    Finally, join this combined data back to the students table.



    Dan Guzman - Not the MVP (7/22/2010)
    All questions have to be prefaced by Server version and 'according to MS Docs' or 'my own personal opinion based on how much detail I felt like digging into at the time.'

  • Hello all

    Thanks for your replies. I never did get around to writing some sql for table creation as something else has been taking up all my time.

    weitzera: Thanks for your suggestion. I was wondering about using rownum but wasn't sure if it would work or how exactly to go about it.

    When I get a chance I'll give this a go and let you know if I got it working.

    Thanks

    Robin

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply