MIN and MAX dates for groups of rows

  • I have a dataset that contains an EmployeeID, StartDate, EndDate, and Location. Each record tells me at which location and employee was during a payroll period (fortnightly). So the data looks like:

    EMP_KEYSTART_DTEND_DTLOCATION

    120130117201301318103

    120130117201301318103

    120130131201302143354

    120130131201302148103

    220130117201301311234

    220130131201302144567

    120130214201302283354

    220130214201302281234

    Employees can be at multiple locations during the two weeks. They can work at one location, stop working there, start working somewhere else, and then maybe go back to their old location. There are duplicate records here as each employee can be at the same location multiple times during the two week period. What I need to capture is the actual start and end date of an employee at each location for each 'assignment'. An assignment is defined as a continual period of employment at a location with gaps of no less than 4 days - if there is a gap of more than four days then that is classed as a new assignment.

    I have done a lot of experimenting with MIN, MAX, PARTITON and joins however cannot get this right.

    Any advice would be much appreciated.

    Thanks,

    Matt

  • Search web for Itzik Ben-Gan Gaps Islands SQL Server. He has numerous blog posts/articles on SQL Server Mag that will help you here.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • I've taken the liberty of putting your sample data into a consumable form, which if you read up on forum posting etiquette (links abound for this) you'll find will be much easier for your intrepid helpers to offer you tested code rather than links to look into to help yourself.

    WITH SampleData (EMP_KEY, START_DT, END_DT, LOCATION) AS

    (

    SELECT 1,'20130117','20130131',8103

    UNION ALL SELECT 1,'20130117','20130131',8103

    UNION ALL SELECT 1,'20130131','20130214',3354

    UNION ALL SELECT 1,'20130131','20130214',8103

    UNION ALL SELECT 2,'20130117','20130131',1234

    UNION ALL SELECT 2,'20130131','20130214',4567

    UNION ALL SELECT 1,'20130214','20130228',3354

    UNION ALL SELECT 2,'20130214','20130228',1234

    )

    SELECT *

    FROM SampleData;

    Now, I'm not entirely convinced this is a Gaps/Islands problem instead of a simple GROUP BY. None of your sample data overlaps except at the end points. You may need to provide more sample data to illustrate your case, but I highly recommend that you at least show the expected results from the SampleData set.

    One example that seems to be missing in your sample data is the case of someone not being at a site for 4 days (or whatever that business rule was).


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • I think it has to be a Gaps & Islands problem, based on this statement if for no other reason:

    An assignment is defined as a continual period of employment at a location with gaps of no less than 4 days - if there is a gap of more than four days then that is classed as a new assignment.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Apologies for the poor forum etiquette and thanks for taking the time to reply - I shall now make a hasty exit to the Gaps and Islands page with my tail firmly between my legs.

  • TheSQLGuru (1/21/2014)


    I think it has to be a Gaps & Islands problem, based on this statement if for no other reason:

    An assignment is defined as a continual period of employment at a location with gaps of no less than 4 days - if there is a gap of more than four days then that is classed as a new assignment.

    Kevin - You certainly could be right, but that case didn't appear in his sample data and without expected results it's pretty difficult to tell for sure.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (1/21/2014)


    TheSQLGuru (1/21/2014)


    I think it has to be a Gaps & Islands problem, based on this statement if for no other reason:

    An assignment is defined as a continual period of employment at a location with gaps of no less than 4 days - if there is a gap of more than four days then that is classed as a new assignment.

    Kevin - You certainly could be right, but that case didn't appear in his sample data and without expected results it's pretty difficult to tell for sure.

    Plus it's much more interesting if it is G&I!! 😎

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (1/21/2014)


    dwain.c (1/21/2014)


    TheSQLGuru (1/21/2014)


    I think it has to be a Gaps & Islands problem, based on this statement if for no other reason:

    An assignment is defined as a continual period of employment at a location with gaps of no less than 4 days - if there is a gap of more than four days then that is classed as a new assignment.

    Kevin - You certainly could be right, but that case didn't appear in his sample data and without expected results it's pretty difficult to tell for sure.

    Plus it's much more interesting if it is G&I!! 😎

    That's kind of why I joined in for the fun! Just in case it is.


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply