Search code examples
mysqlsqldateselectgreatest-n-per-group

MySQL grouping and sorting


mysql> describe jobs;
+--------------+---------+------+-----+---------+----------------+
| Field        | Type    | Null | Key | Default | Extra          |
+--------------+---------+------+-----+---------+----------------+
| job_id       | int(11) | NO   | PRI | NULL    | auto_increment |
| candidate_id | int(11) | NO   | MUL | NULL    |                |
| company_id   | int(11) | NO   | MUL | NULL    |                |
| start_date   | date    | NO   | MUL | NULL    |                |
| end_date     | date    | NO   | MUL | NULL    |                |
+--------------+---------+------+-----+---------+----------------+
5 rows in set (0.01 sec)

Each candidate can have multiple jobs. I want to find the latest job for each candidate (based on start date, because end date can be 0000-00-00), and check (in PHP) if the end_Date is not 0000-00-00.

(if your last end date is not 0000-00-00, then you are currently unemployed, which is what I am looking for; I don't know how to do this in my query, so will do it in PHP).

The query SELECT candidate_id, end_Date FROM jobs ORDER BY candidate_id , start_date DESC gets me halfway there.

+--------------+------------+
| candidate_id | end_Date   |
+--------------+------------+
|            1 | 2019-08-31 |
|            1 | 2019-01-31 |
|            1 | 2019-05-31 |
|            2 | 0000-00-00 |
|            2 | 2018-02-28 |
|            2 | 2017-05-31 |
|            2 | 2016-09-30 |
|            3 | 0000-00-00 |
|            3 | 2019-05-31 |
|            4 | 2019-04-30 |
|            4 | 2019-09-30 |

(How) can I get only the first entry (row with the most recent start_date) for each candidate Id? And can I get only those where the end date is not 0000-00-00?

(Oops, it looks like my ordering by end_date is not working)


Solution

  • You could use a join on max start_date group by candidate

    select  * 
    from jobs j
    inner join  (
    
      select candidate_id ,  max(start_date) max_start_date 
      from jobs
      group by   candidate_id
    
      ) t on t.candidate_id = j.candidate_id 
        and t.max_start_date = j.start_date