I have a dataset with duplicate IDs, different scores, and different days like this
I want one row per ID based on the maximum value of day. This is the output I want
The query I currently have is not deduplicating because there are multiple scores per ID. I am trying to use a subquery in the where clause like this
select distinct
ID,
score,
day
from
table A
where
day in (select max(day)
from table A
group by ID)
The subquery obviously returns only one ID per row, but when I add in the score, I am still getting duplicates.
Any ideas are greatly appreciated!
The following uses a common table expression, which is supported by most DBMS, for getting the max day per ID. Then join to it on both ID and Day.
with maxCTE as (
select id, max(day) as max_day
from table1
group by id
)
select t.*
from table1 t
join maxCTE m
on t.id = m.id
and t.day = m.max_day