Search code examples
mysqljoingroup-by

How to get the last data of group by in mysql


How can I get the last row of employee_id of each departement_id in mysql?

I have two tables, departements and employees.

departements:

+----------------+------------------+------------+  
| departement_id | departement_name | manager_id |  
+----------------+------------------+------------+  
|             10 | Administration   |        101 |  
|             20 | IT               |        103 |  
+----------------+------------------+------------+  

employees:

+-------------+--------+--------+------------+----------------+  
| employee_id | name   | salary | manager_id | departement_id |  
+-------------+--------+--------+------------+----------------+  
|         100 | Steven |   8000 |        101 |             10 |  
|         101 | Lexa   |  10000 |        101 |             10 |  
|         102 | Bruce  |   9000 |        103 |             20 |  
|         103 | Diana  |  11000 |        103 |             20 |  
|         104 | Bruce  |   8500 |        103 |             20 |  
+-------------+--------+--------+------------+----------------+

when I do this query:

select employee_id,name,employees.departement_id,avg(salary) from employees inner join departements on employees.departement_id=departements.departement_id group by employees.departement_id;

I get this result:

+-------------+--------+----------------+-------------+  
| employee_id | name   | departement_id | avg(salary) |  
+-------------+--------+----------------+-------------+  
|         100 | Steven |             10 |   9000.0000 |  
|         102 | Bruce  |             20 |   9500.0000 |  
+-------------+--------+----------------+-------------+  

Please help me how to get this result:

+-------------+--------+----------------+-------------+  
| employee_id | name   | departement_id | avg(salary) |  
+-------------+--------+----------------+-------------+  
|         101 | Lexa   |             10 |   9000.0000 |  
|         104 | Bruce  |             20 |   9500.0000 |  
+-------------+--------+----------------+-------------+ 

Solution

  • Your current query is permitted by MySQL because ONLY_FULL_GROUP_BY is OFF. You might not have made that choice or be aware that there is a choice. Before MySQL 5.7 it was OFF (either by default or because the option didn't exist) but from 5.7 the option is ON by default, and hence, since 5.7 by it became a choice. I would recommend you turn it back on and live with the restrictions it imposes.

    When using non-standard group by queries (allowed when ONLY_FULL_GROUP_BY is OFF) there are 3 categories of columns:

    1. Grouping columns i.e. those columns specified in the group by clause
    2. "Aggregating" columns i.e. those using SUM/AVG/MIN/MAX/COUNT etc
    3. "Non-Aggregating & Non-Grouping" columns i.e. any column not in 1 or 2

    This third category of column IS A BIG PROBLEM and unless a whole range of special conditions exist what MySQL spits out into these columns is "non-deterministic" (i.e. pretty much random) and not even using an order by column improves this. Read from MySQL's own documentation:

    If ONLY_FULL_GROUP_BY is disabled, a MySQL extension to the standard SQL use of GROUP BY permits the select list, HAVING condition, or ORDER BY list to refer to nonaggregated columns even if the columns are not functionally dependent on GROUP BY columns.

    ... In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are nondeterministic, which is probably not what you want.

    Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause.

    From: 12.19.3 MySQL Handling of GROUP BY

    CREATE TABLE departements (
        departement_id INT PRIMARY KEY,
        departement_name VARCHAR(255),
        manager_id INT
    );
    
     
    CREATE TABLE employees (
        employee_id INT PRIMARY KEY,
        name VARCHAR(255),
        salary INT,
        manager_id INT,
        departement_id INT
    );
    
     
    INSERT INTO departements (departement_id, departement_name, manager_id)
    VALUES
    (10, 'Administration', 101),
    (20, 'IT', 103);
    
    
    INSERT INTO employees (employee_id, name, salary, manager_id, departement_id)
    VALUES
    (100, 'Steven', 8000, 101, 10),
    (101, 'Lexa', 10000, 101, 10),
    (102, 'Bruce', 9000, 100, 20),
    (103, 'Diana', 11000, 103, 20),
    (104, 'Bruce', 8500, 103, 20);
    

    -- force to OFF
    SET SESSION sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));
    
    
    SELECT
          employee_id               -- "non-aggregating" column
        , name                      -- "non-aggregating" column
        , employees.departement_id  -- grouped by
        , avg(salary)               -- aggregating column
    FROM employees
    INNER JOIN departements ON employees.departement_id = departements.departement_id
    GROUP BY
         employees.departement_id
    ORDER BY
          employee_id               -- "non-aggregating" column  
      ;
    
    employee_id name departement_id avg(salary)
    100 Steven 10 9000.0000
    102 Bruce 20 9500.0000
    SELECT
          employee_id               -- "non-aggregating" column
        , name                      -- "non-aggregating" column
        , employees.departement_id  -- grouped by
        , avg(salary)               -- aggregating column
    FROM employees
    INNER JOIN departements ON employees.departement_id = departements.departement_id
    GROUP BY
         employees.departement_id
    ORDER BY
          employee_id  DESC        -- "non-aggregating" column  
      ;
    
    employee_id name departement_id avg(salary)
    102 Bruce 20 9500.0000
    100 Steven nb: expected 101, 'Lexa' i.e. order by did NOT work 10 9000.0000

    fiddle