I'm trying to run a SQL query that seems easy but I don't understand what mechanism I should use to achieve this. I have the following table:
| id | playcount | release_date |
| -- | --------- | ------------ |
| 1 | 47712 | 2019-12-27 |
| 2 | 626778 | 2017-07-23 |
| 3 | 827091 | 2019-09-12 |
| 4 | 66419 | 2015-09-05 |
| 5 | 58627 | 2016-09-12 |
| 6 | 60272 | 2017-09-06 |
| 7 | 1762582 | 2017-10-07 |
| 8 | 11280 | 2016-10-12 |
| 9 | 30890 | 2019-10-29 |
| 10 | 715173 | 2019-07-02 |
I would like to find which id holds the maximum playcount
per year.
In this example, the id that has the maximum playcount
in 2019 is 3
.
And I'm looking for the following output:
playcount_table
| year | playcount | id |
| ---- | --------- | -- |
| 2019 | 827091 | 3 |
| 2017 | 1762582 | 7 |
| 2016 | 58627 | 5 |
| 2015 | 66419 | 4 |
I succeeded to get the maximum playcount by year with the following query:
select to_char(date_trunc('year', TO_DATE(p_table.release_date, 'YYYY-MM-DD')), 'YYYY-MM-DD') as year,
max(p_table.playcount) as playcounts
from playcount_table as p_table
group by year;
But I can't retrieve the id
related to this max playcount
.
Can you help me with this ?
Thank you in advance,
Using ROW_NUMBER
:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY EXTRACT(year FROM release_date)
ORDER BY playcount DESC) rn
FROM playcount_table
)
SELECT
EXTRACT(year FROM release_date) year
playcount,
id
FROM cte
WHERE
rn = 1
ORDER BY
EXTRACT(year FROM release_date) DESC;
If a given year could have two or more records tied for the greatest play count, and you would want to report both of them, then replace ROW_NUMBER
above with RANK
.
We might also be able to use DISTINCT ON
here:
SELECT DISTINCT ON (EXTRACT(year FROM release_date))
EXTRACT(year FROM release_date),
playcount,
id
FROM playcount_table
ORDER BY
EXTRACT(year FROM release_date),
playcount DESC;