I have a table named PERSON in an HSQL database like this:
NAME(PK) | AGE | CITY | ... many more here ... |
--------------------------------------------------
aaa | 12 | nyc | ...
bbb | 12 | nyc | ...
ccc | 10 | nyc | ...
ddd | 34 | la | ...
eee | 10 | la | ...
For each city I need to select the record with the maximum age. If for a given city there are multiple records with tied maximum age I still need to select exactly one record for this city (but it can be arbitrarily chosen).
So in the above example I need this result:
NAME(PK) | AGE | CITY | ... many more here ... |
--------------------------------------------------
aaa | 12 | nyc | ...
ddd | 34 | la | ...
and it would be ok if I got bbb instead of aaa, but not ok to get aaa and bbb.
Simply using group by on the city column and max(age) as aggregate function does not work, because this does not allow me to select other columns than age and city as they are not in the aggregate. I tried doing the group by and then joining the result back to the table, but this way I do not manage to get rid of records with duplicate max age. This query:
SELECT NAME, CITY, AGE, [... many more here ...]
FROM (
SELECT max(age) AS maxAge, city
FROM PERSON
GROUP BY CITY
) AS x
JOIN PERSON AS p
ON p.city = x.city AND p.age = x.maxAge
yields:
NAME(PK) | AGE | CITY | ... many more here ... |
--------------------------------------------------
aaa | 12 | nyc | ...
bbb | 12 | nyc | ...
ddd | 34 | la | ...
whith two records for nyc where there should be only one.
A modern SQL alternative to the correlated subquery solution is the LATERAL keyword:
SELECT * FROM
(SELECT DISTINCT CITY FROM PERSON) CITIES,
LATERAL
(SELECT * FROM PERSON WHERE CITY = CITIES.CITY ORDER BY AGE DESC LIMIT 1)