Finding min and max date within a sub-group

  • My task is to find the minimum and maximum date for the Code contained in the latest row. A picture will help...

    As you will see in the attached example, I need t-sql to 1) find the maximum date for the EMPLOYEE 2) capture the JOB_CODE related to the max date 3) find the minimum date (going backward in EFFECTIVE_DATE descending) related to the specific JOB_CODE.

    You will notice the JOB_CODE can come and go in date order. I only want the min date for the latest set of continuous JOB_CODE rows.

    In the attached example, I want to end up with EFFECTIVE_DATE = '2009-08-17' and END_DATE = '2011-06-09' as based on JOB_CODE = 'T009'.

    Thanks for the help.

  • Try this

    SELECT

    EMPLOYEE,

    MIN(EFECT_DATE) AS EFECT_DATE,

    MAX(END_DATE) AS END_DATE,

    JOB_CODE

    FROM TABLE

    GROUP BY EMPLOYEE,JOB_CODE

    HAVING JOB_CODE ='T009'

  • Can you post some DDL and sample data?

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • From what I understood from reading your requirement and looking at the image you uploaded....i have come up with the following....I think it should work for you:

    Select Employee, MIN(Effect_Date) As MinEffectDate, MIN(End_Date) From Table

    Where JobCode = (Select JobCode From Table Where Effect_Date = (Select MAX(Effect_Date) From Table))

    Group By Employee

    If this does not work for your requirement the please visit the link in my signature and post DDL and sample data as shown in the link.....it is very tedious to create a sample data set by looking at the image you uploaded and the people here would not have the time to do so.

    If you post the DDL and Sample data as shown in the link in my signature, then you might have a better chance of getting your problem solved.

    EDIT(07/11/2012): There is a slight edit in this code......I forgot to add Group By in the code....you'll have to group by Employee.

    Vinu Vijayan

    For better and faster solutions please check..."How to post data/code on a forum to get the best help" - Jeff Moden[/url] 😉

  • CELKO (11/6/2012)


    Does your boss draw pictures when he wants to give you programming specs?

    🙁 Unfortunately, yes. However, he pays me to use the pictures and we're all volunteers here.

    We need some help, to help, we need DDL and consumable sample data.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • This should work if I am understanding your problem correctly. Please test it with different test cases.

    I have assumed that there is a identity row id without gaps for each row, and in the order your data is in. Also, when you said "find the maximum date for the EMPLOYEE", I assumed you meant the maximum EndDate; if it is EffectDate, replace it accordingly.

    There are 2 solutions; Solution 1 is useful if your most recent date is always in the last row (row with the highest RowId); otherwise check out Solution 2.

    There are certain aspects where you can improve the performance; but this can be a good starting point. Again, please test the code with various test cases.

    CREATE TABLE #a(Employee INT, Effect_Date DATETIME, End_Date DATETIME, Job_Code VARCHAR(10), RowId INT IDENTITY(1, 1))

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '11/12/1992', '6/17/1993', 'T065'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '6/18/1993', '8/31/1993', 'T065'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '9/1/1993', '7/31/2001', 'T065'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '8/1/2001', '9/19/2004', 'T083'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '9/20/2004', '7/31/2007', 'T083'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '8/1/2007', '8/15/2007', 'T079'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '8/16/2007', '8/12/2009', 'T079'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '8/16/2007', '6/12/2008', 'T009'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '6/13/2008', '8/13/2008', 'T009'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '8/14/2008', '6/11/2009', 'T009'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '6/12/2009', '8/16/2009', 'T009'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '8/13/2009', '8/14/2012', 'T079'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '8/17/2009', '12/31/2009', 'T009'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '1/11/2010', '8/12/2010', 'T009'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '8/16/2010', '12/12/2010', 'T009'

    --INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '8/16/2012', '12/12/2013', 'T009'

    INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '12/13/2010', '6/9/2011', 'T009'

    --INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '6/10/2011', '6/30/2011', 'T011'

    --INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '7/1/2011', '9/30/2011', 'T011'

    --INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '10/1/2011', '10/28/2011', 'T009'

    --INSERT INTO #a(Employee, Effect_Date, End_Date, Job_Code) SELECT 1, '10/29/2011', '12/30/2011', 'T009'

    --------Solution 1. (If your Max Date is always in the last row; use this)

    /*

    For a given employee, join on rows where the Job Code is different, but the rowid in the left table is lower than the row-id on the right table.

    This means that the last continuous sets of Employee-Job_Codes will have null values on the right hand side.

    And since your Max date is in the last row, then a min and max on the left hand row-ids where right hand row-ids are null should give you the rows that contain your solution data.

    */

    SELECT A1.Employee, A1.Job_Code, MIN(A1.Effect_Date) AS Effect_Date, MAX(A1.End_Date) AS End_Date

    FROM #a A1

    LEFT JOIN #a A2 ON (A1.Employee = A2.Employee AND A1.Job_Code <> A2.Job_Code AND A1.RowId < A2.RowId)

    WHERE A2.RowId IS NULL

    GROUP BY A1.Employee, A1.Job_Code

    --------Solution 2. (If your Max Date is not always in the last row; use this. This should work even if your max date is in the last row)

    /*

    In this solution, first get the MinRow and Maxrow for each continuous sets of Employee-Job_Codes.

    To do this, join on rows where the Job_Code on Right table is different but left RowId is less than right RowId and then get a minimum of Right RowId (lets call this MinRowId); Then each Employee-Job_Code-MinRowId represents a continuous set

    Then get that row from this intermediate result table which contains the row-id where max date is present and use it to get the related dates

    */

    ; WITH D1 (Employee, Job_Code, Effect_Date, End_Date, RowId, MinRowId) AS -- Step 1 of [Gets MinRow and Maxrow for each Continuous Sets of Employee-Job_Codes]

    (

    SELECT A1.Employee, A1.Job_Code, A1.Effect_Date AS Effect_Date, A1.End_Date AS End_Date, A1.RowId, MIN(A2.RowId) AS MinRowId

    FROM #a A1

    LEFT JOIN #a A2 ON (A1.Employee = A2.Employee AND A1.Job_Code <> A2.Job_Code AND A1.RowId < A2.RowId)

    GROUP BY A1.Employee, A1.Job_Code, A1.Effect_Date, A1.End_Date, A1.RowId

    ),

    D2 (Employee, Job_Code, MinRowId, MaxRowId) AS -- Step 2 of [Gets MinRow and Maxrow for each Continuous Sets of Employee-Job_Codes]

    (

    SELECT A.Employee, A.Job_Code, MIN(A.RowId) AS MinRowId, MAX(A.RowId) AS MaxRowId

    FROM D1 AS A

    GROUP BY A.Employee, A.Job_Code, A.MinRowId

    ),

    MRD(Employee, RowId, DtRnk) AS -- Row for [Most Recent Effect_Date]

    (

    SELECT A1.Employee, A1.RowId, ROW_NUMBER() OVER (PARTITION BY A1.Employee ORDER BY A1.Employee, A1.Effect_Date DESC) AS Rnk -- Change Effect_Date to End_Date in the Rnk if your max date has to be End_Date

    FROM #a A1

    )

    SELECT D2.Employee, D2.Job_Code, AMIN.Effect_Date, AMAX.End_Date

    FROM MRD M

    INNER JOIN D2 ON (M.Employee = D2.Employee AND M.RowId BETWEEN D2.MinRowId AND D2.MaxRowId) -- Max Date Row is Between MinRow and MaxRow of a Employee-Job_Code cOntinuous set.

    INNER JOIN #a AMIN ON (D2.Employee = AMIN.Employee AND D2.Job_Code = AMIN.Job_Code AND D2.MinRowId = AMIN.RowId)

    INNER JOIN #a AMAX ON (D2.Employee = AMAX.Employee AND D2.Job_Code = AMAX.Job_Code AND M.RowId = AMAX.RowId)

    WHERE M.DtRnk = 1

    DROP TABLE #a

  • Sorry for not giving sample data.

    Sam, your answer works. Thank you very much. 😀

    I used the first script since the last row is always the highest date.

    I did have to add a RowID.

    If anyone is interested, I attached a spreadsheet containing Sheet1 with raw data and Sheet2 with results.

    Thanks to all who replied!

  • Ron,

    Even if the Excel file is better than an image, it's not the best option to post sample data.

    There are people that won't download any files (due to company policies or personal preferences) and it's not in the best format to use it as it is.

    You should post your sample data the way Sam did. Take a look at the article linked in my signature for more information for your future posts.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thanks for the tips, Luis.

    Since I am learning here, does Sam get points for answering my question ?

  • A simple reply is enough, there are no extra points for correct answers.

    When you spend more time in this forum, you'll find out that answers that might seem correct, could be corrected to perform better or fix a bug.

    Welcome to SQL Server Central.

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply