I have a table as following:
book_id author_id mark year
1 1 BAD 2014
1 1 MEDIUM 2014
1 1 GREAT 2015
I would like to execute a query that will give me the best book for each author. Something like this:
book_id author_id mark year
1 1 GREAT 2015
I tried to use the distinct keyword on multiple fields - but when I do this:
select distinct book_id, author_id from Books
I get only the book_id and the author_id (as expected) - but I also need the mark and the year - but I cannot add it to the distinct phrase.
Currently I'm using Postgres 9.4 but I need an ANSI-SQL solution.
Is there a way I can do that?
greatest-n-per-group questions are usually solved using window functions:
select *
from (
select book_id, author_id, mark, year,
row_number() over (partition by author_id order by case mark when 'GREAT' then 1 when 'MEDIUM' then 2 else 3 end) as rn
from books
) t
where rn = 1;
The above is standard ANSI SQL, but in Postgres using the (proprietary) distinct on
is usually much faster:
select distinct on (author_id) book_id, author_id, mark, year,
from books
order by author_id,
case mark when 'GREAT' then 1 when 'MEDIUM' then 2 else 3 end