Search code examples
mysqlgreatest-n-per-group

MySQL select unique rows in two columns with the highest value in one column


I have a basic table:

+-----+--------+------+------+
| id, | name,  | cat, | time |
+-----+--------+------+------+
|   1 | jamie  |    1 |  100 |
|   2 | jamie  |    2 |  100 |
|   3 | jamie  |    1 |   50 |
|   4 | jamie  |    2 |  150 |
|   5 | bob    |    1 |  100 |
|   6 | tim    |    1 |  300 |
|   7 | alice  |    4 |  100 |
+-----+--------+------+------+

I tried using the "Left Joining with self, tweaking join conditions and filters" part of this answer: SQL Select only rows with Max Value on a Column but some reason when there are records with a value of 0 it breaks, and it also doesn't return every unique answer for some reason.

When doing the query on this table I'd like to receive the following values:

+-----+--------+------+------+
| id, | name,  | cat, | time |
+-----+--------+------+------+
|   1 | jamie  |    1 |  100 |
|   4 | jamie  |    2 |  150 |
|   5 | bob    |    1 |  100 |
|   6 | tim    |    1 |  300 |
|   7 | alice  |    4 |  100 |
+-----+--------+------+------+

Because they are unique on name and cat and have the highest time value.

The query I adapted from the answer above is:

SELECT a.name, a.cat, a.id, a.time
FROM data A
INNER JOIN (
    SELECT name, cat, id, MAX(time) as time
    FROM data
    WHERE extra_column = 1
    GROUP BY name, cat
) b ON a.id = b.id AND a.time = b.time

Solution

  • The issue here is that ID is unique per row you can't get the unique value when getting the max; you have to join on the grouped values instead.

    SELECT a.name, a.cat, a.id, a.time
    FROM data A
    INNER JOIN (
        SELECT name, cat, MAX(time) as time
        FROM data
        WHERE extra_column = 1
        GROUP BY name, cat
    ) b ON A.Cat = B.cat and A.Name = B.Name AND a.time = b.time
    

    Think about it... So what ID is mySQL returning form the Inline view? It could be 1 or 3 and 2 or 4 for jamie. Hows does the engine know to pick the one with the max ID? it is "free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. " it could pick the wrong one resulting in incorrect results. So you can't use it to join on.

    https://dev.mysql.com/doc/refman/5.0/en/group-by-handling.html