Search code examples
sqldatetimesubquerymaxgreatest-n-per-group

Finding rows with MAX value after using GROUP BY


I have a table with 4 columns: dept_no, emp_no, from_date, to_date, where for each dept_no, emp_no is the manager.

I want to find current manager using to_date, and I want to display dept_no and emp_no.

Sample Data:

|emp_no  |dept_no  | from_date  |   to_date  |
| 11     | d001    | 1985-01-01 | 1991-10-01 |
| 12     | d001    | 1991-10-01 | 9999-01-01 |
| 21     | d002    | 1985-01-01 | 1989-12-17 |
| 22     | d002    | 1989-12-17 | 9999-01-01 |
| 31     | d003    | 1985-01-01 | 1992-03-21 |
| 32     | d003    | 1992-03-21 | 9999-01-01 |

Sample Output:

|emp_no   |dept_no  |
|12       |d001     |
|22       |d002     |
|32       |d003     |

I figured out this:

SELECT dept_no
     , emp_no 
  FROM 
     ( SELECT dept_no
            , MAX(to_date) as cur 
         FROM dept_manager 
        GROUP 
           BY dept_no) as new 
  JOIN dept_manager using(dept_no) 
 where cur = to_date;

I'm finding the MAX(to_date) for each department and then using it in WHERE clause.

This works but I feel that there should be better ways to do this.

I have seen many similar questions but none helped me as I want to display a column that can't be used in group by.


Solution

  • A portable and usually efficient approach is to filter with a subquery:

    select dept_no, emp_no
    from dept_manager d
    where to_date = (select max(d1.to_date) from dept_manager d1 where d1.dept_no = d.dept_no)
    

    For performance with this query, you want an index on (dept_no, to_date).

    Another common approach is window functions:

    select *
    from (
        select d.*, row_number() over(partition by dept_no order by to_date desc) rn
        from dept_manager d
    ) d
    where rn = 1
    

    Depending on your database and version, there may be neater alternatives.