I have data like below:
select study_id , updated_by ,created_at
from my_table ps
where study_id = '1';
I want sort the records in Descending order based on created_at
and pick distinct study_id
and updated_by
.
I tried below:
I've ran into a weird issue here in Redshift
. Please consider the below query:
select study_id , updated_by
from my_table ps
where study_id = '1'
ORDER BY created_at DESC ;
This results in:
But I need to pick only Distinct
records. So, I used this query:
select DISTINCT study_id , updated_by
from my_table ps
where study_id = '1'
ORDER BY created_at DESC ;
This results in:
As you can see, the record with maya2
is now being shown as latest instead of maya1
.
Why does the sorting breaks with DISTINCT
? How can I fix this?
so in your data set only maya1 is duplicated and maya1 holding the timestamp 7:31 and 7:29 but maya2 is holding 7:30 , so when you are using distinct then query engine remove one maya1 which is holding 7:31 as a result maya2 is getting the top posintion
if you need latest then simply use max
select study_id , updated_by ,max(created_at) as created_at
from my_table ps
where study_id = '1'
group by study_id , updated_by
ORDER BY created_at DESC
if you need only study_id , updated_by then use row_number()
select select study_id , updated_by from
( select study_id , updated_by ,
row_number() over(partition by study_id , updated_by ORDER BY created_at DESC ) rn
from my_table ps
where study_id = '1'
) a where a.rn=1