Can't figure out how to get MIN/MAX dates per grouping

  • I have a table with Employee, it lists their department, and has a row for each "position/promotion". (Just the pertinent fields below in the example shown of course and only a single employee for example)

    An employee may have a single row in a department, or if they were promoted from line staff to manager they could have many.

    Additionally, an employee may move around departments during their employment.

    I need to get the min and max dates (basically duration) of their time at each department.

    The issue is because they repeated departments when I go to group it loses movement between departments.

    Raw data:

    Employee DepartmentId StartDt EndDt

    ----------- ------------ ---------- ----------

    999 5 2000-01-01 2001-04-30

    999 7 2001-05-01 2005-06-30

    999 7 2005-07-01 2006-09-30

    999 5 2006-10-01 2009-10-31

    999 5 2009-11-01 2012-01-01

    Result of doing MIN/MAX:

    Employee DepartmentId MinStartDt MaxEndDt

    ----------- ------------ ---------- ----------

    999 5 2000-01-01 2012-01-01

    999 7 2001-05-01 2006-09-30

    However, Ideally I would get 3 rows (Department 5, then the stay at Department 7, then back to Department 5 again).

    Employee DepartmentId MinStartDt MaxEndDt

    ----------- ------------ ---------- ----------

    999 5 2000-01-01 2001-04-30

    999 7 2001-05-01 2006-09-30

    999 5 2006-10-01 2012-01-01

    I'm missing some obvious here and just not seeing it, if someone could lend a hand I would greatly appreciate it!

    I played with RANK and ROW_NUM hoping I could get it to group / partition on each department and repeat the number for each department so the first department 5 would be RowNum=1, then the next group of Dept 7 would be =2 and the last group of Dept 5 would = 3 and then I could min/max on that grouping, but I didn't have any luck with that approach either.

    Thanks!

    My table:

    CREATE TABLE [dbo].[EmployeeDepartmentHistory](

    [Employee] [int] NOT NULL,

    [DepartmentId] [int] NOT NULL,

    [StartDt] [date] NOT NULL,

    [EndDt] [date] NOT NULL

    ) ON [PRIMARY]

    GO

    INSERT [dbo].[EmployeeDepartmentHistory] ([Employee], [DepartmentId], [StartDt], [EndDt]) VALUES (999, 5, CAST(N'2000-01-01' AS Date), CAST(N'2001-04-30' AS Date))

    GO

    INSERT [dbo].[EmployeeDepartmentHistory] ([Employee], [DepartmentId], [StartDt], [EndDt]) VALUES (999, 7, CAST(N'2001-05-01' AS Date), CAST(N'2005-06-30' AS Date))

    GO

    INSERT [dbo].[EmployeeDepartmentHistory] ([Employee], [DepartmentId], [StartDt], [EndDt]) VALUES (999, 7, CAST(N'2005-07-01' AS Date), CAST(N'2006-09-30' AS Date))

    GO

    INSERT [dbo].[EmployeeDepartmentHistory] ([Employee], [DepartmentId], [StartDt], [EndDt]) VALUES (999, 5, CAST(N'2006-10-01' AS Date), CAST(N'2009-10-31' AS Date))

    GO

    INSERT [dbo].[EmployeeDepartmentHistory] ([Employee], [DepartmentId], [StartDt], [EndDt]) VALUES (999, 5, CAST(N'2009-11-01' AS Date), CAST(N'2012-01-01' AS Date))

    GO

  • Comments in the code.

    -- need to create a data "island" of the times in each department.

    -- to do this, you need a sequential number for each employee,

    -- a sequential number for employee/department, and then a calculation.

    -- for the sequential numbers, we will use the ROW_NUMBER() function,

    -- with different PARTITION BY clauses to generate the data islands

    -- then you take the difference between these values to group the rows together

    WITH cte AS

    (

    SELECT Employee, DepartmentId, StartDt, EndDt,

    ROW_NUMBER() OVER (PARTITION BY Employee ORDER BY StartDt) -

    ROW_NUMBER() OVER (PARTITION BY Employee, DepartmentId ORDER BY StartDt) AS Grp

    FROM [dbo].[EmployeeDepartmentHistory]

    )

    -- now get your min/max, grouping by the calculated column

    SELECT Employee,

    DepartmentId,

    MIN(StartDt) AS MinStartDt,

    MAX(EndDt) AS MaxEndDt

    FROM cte

    GROUP BY Employee, DepartmentId, Grp

    ORDER BY Employee, MinStartDt;

    results:

    Employee DepartmentId MinStartDt MaxEndDt

    ----------- ------------ ---------- ----------

    999 5 2000-01-01 2001-04-30

    999 7 2001-05-01 2006-09-30

    999 5 2006-10-01 2012-01-01

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • AH of course!

    I see, using row number twice and subtracting to find the difference to create the desired output.

    Perfect, thanks!

    That never even occurred to me, I was playing around with multiple CTEs and isolating it and at that point I realized I had just gone off the deep end 🙂

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply