Search code examples
sqlmysqlgreatest-n-per-group

Get value from the same row as an aggregate function (max) in MySQL using SELECT query


I am trying to use a max(..) to get the maximum value from a set, but I want to also get the other row values that correspond to this row. I'm using MySQL 8

SELECT user_id,
    max(login_time) as last_login_time,
    task_name as last_task_worked_on 
        FROM user_tasks
            WHERE user_id IN (123, 456, 789) group by user_id;

Here is the DB schema & some sample values that I'm working with:

create table user_tasks (
        user_id int(11), 
        login_time datetime, 
        task_name varchar(50) )

insert into user_tasks values (123, '2023-01-30 06:10:03', 'walk dog');
insert into user_tasks values (123, '2023-02-30 06:10:03', 'bathe dog');
insert into user_tasks values (123, '2023-05-30 06:10:03', 'feed dog');

insert into user_tasks values (456, '2023-03-30 06:10:03', 'feed dog');
insert into user_tasks values (456, '2023-02-30 06:10:03', 'walk dog');
insert into user_tasks values (456, '2023-01-30 06:10:03', 'bathe dog');

insert into user_tasks values (789, '2023-03-30 06:10:03', 'feed dog');
insert into user_tasks values (789, '2023-01-30 06:10:03', 'walk dog');
insert into user_tasks values (789, '2023-05-30 06:10:03', 'bathe dog');

insert into user_tasks values (898, '2023-05-30 06:10:03', 'walk dog');
insert into user_tasks values (900, '2023-05-30 06:10:03', 'bathe dog');

The above SELECT statement yields this result:

'123','2023-05-30 06:10:03','walk dog'
'456','2023-03-30 06:10:03','feed dog'
'789','2023-05-30 06:10:03','feed dog'

However, the first value should be "feed dog" not "walk dog", and the last value should be "bathe dog" not "feed dog".

The middle value is correct, but only because "feed dog" happens to correspond to the first row, which is also the max(login_time) value.

The output that I want is this:

'123','2023-05-30 06:10:03','feed dog'
'456','2023-03-30 06:10:03','feed dog'
'789','2023-05-30 06:10:03','bathe dog'

Solution

  • Here how to do it.

    First get latest login_time per user_id :

    SELECT user_id, max(login_time) as last_login_time
    FROM user_tasks
    WHERE user_id IN (123, 456, 789) 
    GROUP BY user_id
    

    Then join this dataset with your table to get the expected output :

    SELECT u.*
    FROM user_tasks u
    INNER JOIN (
      SELECT user_id, max(login_time) as last_login_time
      FROM user_tasks
      WHERE user_id IN (123, 456, 789) 
      group by user_id
    ) AS s ON s.user_id = u.user_id and s.last_login_time = u.login_time;
    

    Results :

    user_id login_time          task_name
    123     2023-05-30 06:10:03 feed dog
    456     2023-03-30 06:10:03 feed dog
    789     2023-05-30 06:10:03 bathe dog