Search code examples
mysqlrecords

MySQL: get oldest record from most recent group


Sorry for the confusing title, but it's the best way to explain it. This is not a usual "most recent from group" problem and I haven't been able to find anything similar on the web.

I have a status table that tracks what people are doing at various work sites. It contains records that link people, status and location.

ID, start_date, person_ID, location_ID, status
1,  2014-10-12, 1,         1,           job a
2,  2014-10-13, 2,         2,           job b
3,  2014-10-15, 1,         3,           job c
4,  2014-10-21, 1,         3,           job d
5,  2014-10-22, 2,         4,           job a
6,  2014-10-26, 2,         2,           job d

I need to be able to determine how long each person as been at the current site - I'm hoping to get results like this:

person_ID, location_ID, since
1,         3,           2014-10-15
2,         2,           2014-10-26

Getting when they started the current job is relatively easy by joining the max(start_date), but I need the min(start_date) from the jobs done at the most recent location.

I have been trying to join the min(start_date) within the records that match the current location (from the most recent record), and that works great until I have a person (like person 2) who has multiple visits to the current location... you can see in my desired results that I want the 10-26 date, not the 10-13 which is the first time they were at the site.

I need some method for matching the the job records for a given person, and then iterating back until the location doesn't match. I'm figuring there has to be some way to do this with some sub-queries and some clever joins, but I haven't been able to find it yet, so I would appreciate some help.


Solution

  • If I understand what you're asking correctly, you could use EXISTS to eliminate all but the most recent locations per person, and get the min date from the resulting rows.

    SELECT person_id, location_id, MIN(start_date) since
    FROM status s
    WHERE NOT EXISTS (
      SELECT 1 FROM status 
      WHERE s.person_id = person_id 
        AND s.location_id <> location_id
        AND s.start_date < start_date)
    GROUP BY person_id
    

    An SQLfiddle to test with.

    Basically, it eliminates all locations and times where the same person has visited another location more recently. For example;

    1,  2014-10-12, 1,         1,           job a
    

    ...is eliminated since person 1 has visited location 3 more recently, while;

    3,  2014-10-15, 1,         3,           job c
    

    ...is kept since the same person has only visited the same location more recently.

    It then just picks the least recent time per person. Since only the rows from the last location are kept, it will be the least recent time from the most recent location.