Including extra columns in a grouping

  • Greetings,

    I have what appears to be a deceptively simple question. I have a table(Jobs) with a number of columns in it(10). I am only interested in 4 of those columns. They would be:

    Project,

    Employee_ID,

    Job_Name,

    Job_Start_Date

    This table could contain, for these 4 columns:

    'Beach', 'Emp01', 'Drive', 2009-04-01

    'Beach', 'Emp01', 'Unpack', 2009-04-02

    'Beach', 'Emp01', 'Blanket', 2009-04-03

    'Beach', 'Emp02', 'Talk', 2009-05-02

    'Beach', 'Emp03', 'Surf', 2009-06-14

    'Beach', 'Emp03', 'BreakLeg', 2009-06-20

    There could be more than one Project, but for this question and for sake of size, I have given only 1. What I would like to do is to take this result set and then return from it, based on the Project and Employee_ID, the highest Job_Start_Date and it's corresponding Job_Name.

    I can write a select to get the MAX(Job_Start_Date), but then how to include the Job_Name?

    /*

    Non-Working example to show progression and stopping point due to cluelessness.

    ** No need to say this SELECT is incorrect or bad **

    */

    SELECT

    Project,

    Employee_ID,

    Job_Name, -- Needs to be the same Job_Name from the MAX(Job_Start_Date) record.

    MAX(Job_Start_Date)

    FROM Jobs

    GROUP BY Project, Employee_ID

    The final result should be:

    'Beach', 'Emp01', 'Blanket', 2009-04-03

    'Beach', 'Emp02', 'Talk', 2009-05-02

    'Beach', 'Emp03', 'BreakLeg', 2009-06-20

    I hope that this simplified example still provides enough information to help find a solution to my problem. If you need more information, please let me know and I will try to give you more.

    Thank you for your time and help.

    Terry Steadman

  • Try this maybe?

    /************************

    Including extra columns in a grouping - solution

    Lee Everest

    http://www.texastoo.com

    ************************/

    use tempdb

    go

    CREATE TABLE Project (

    Project varchar (10)

    ,Employee_ID varchar (10)

    ,Job_Name varchar (20)

    ,Job_Start_Date datetime

    )

    GO

    INSERT INTO Project VALUES ('Beach','Emp01', 'Drive', '4/1/2009')

    INSERT INTO Project VALUES ('Beach','Emp01', 'Unpack', '4/2/2009')

    INSERT INTO Project VALUES ('Beach','Emp01', 'Blanket', '4/3/2009')

    INSERT INTO Project VALUES ('Beach','Emp02', 'Talk', '5/2/2009')

    INSERT INTO Project VALUES ('Beach','Emp03', 'Surf', '6/14/2009')

    INSERT INTO Project VALUES ('Beach','Emp03', 'BreakLeg', '6/20/2009')

    GO

    WITH maxrow AS

    (

    SELECT

    Project,

    Employee_ID,

    Job_Name,

    MAX(Job_Start_Date) as MaxStart

    ,ROW_NUMBER() OVER (PARTITION BY employee_id order by employee_id) as RowNumber

    FROM project

    GROUP BY Project, Employee_ID, job_name, job_start_date

    )

    SELECT project, employee_id, job_name, maxstart

    FROM maxrow

    WHERE rownumber=1

    GO

  • Greetings Lee,

    Thank you for your reply and the code. I have not worked with Multidimensional Expressions (MDX) before. So, it may take a bit of time to study your code example and understand it enough to apply to my situation properly. Your grasp of Microsoft SQL is certainly greater than mine.

    May I ask if you might be available for more direct communication, via email maybe, so I could possibly pick your knowledge to understand this coding style better and how to apply to my situations?

    Thank you for your time.

    Terry Steadman

  • Terry,

    I took Lee's code and tweaked it just a bit to account for the fact that you can have multiple projects.

    CREATE TABLE Project (

    Project varchar (10)

    ,Employee_ID varchar (10)

    ,Job_Name varchar (20)

    ,Job_Start_Date datetime

    )

    GO

    -- TRUNCATE TABLE Project

    INSERT INTO Project VALUES ('Beach','Emp01', 'Drive', '4/1/2009')

    INSERT INTO Project VALUES ('Beach','Emp01', 'Unpack', '4/2/2009')

    INSERT INTO Project VALUES ('Beach','Emp01', 'Blanket', '4/3/2009')

    INSERT INTO Project VALUES ('Beach','Emp02', 'Talk', '5/2/2009')

    INSERT INTO Project VALUES ('Beach','Emp03', 'Surf', '6/14/2009')

    INSERT INTO Project VALUES ('Beach','Emp03', 'BreakLeg', '6/20/2009')

    INSERT INTO Project VALUES ('Lake','Emp03', 'BreakLeg', '6/20/2009')

    INSERT INTO Project VALUES ('Lake','Emp03', 'Surf', '6/14/2009')

    INSERT INTO Project VALUES ('Lake','Emp02', 'Talk', '5/2/2009')

    INSERT INTO Project VALUES ('Lake','Emp01', 'Blanket', '4/1/2009')

    INSERT INTO Project VALUES ('Lake','Emp01', 'Unpack', '4/2/2009')

    INSERT INTO Project VALUES ('Lake','Emp01', 'Drive', '4/1/2009')

    GO

    WITH maxrow AS

    (

    SELECT

    Project,

    Employee_ID,

    Job_Name,

    MAX(Job_Start_Date) as MaxStart

    ,ROW_NUMBER() OVER (PARTITION BY Project, employee_id order by Project, employee_id, Job_Start_Date) as RowNumber

    FROM project

    GROUP BY Project, Employee_ID, job_name, job_start_date

    )

    SELECT project, employee_id, job_name, maxstart

    FROM maxrow

    WHERE rownumber=1

    ORDER BY Project

    GO

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply