sql amazon-redshift amazon-redshift-spectrum sql-rank

Redshift get rows with highest priority and timestamp

I have a table in Redshift with records as follows:


+====+======+=========+============+
| Id | Book | Action  | Timestamp  |
+====+======+=========+============+
| 1  | ABC  | ADDED   | 1111111111 |
+----+------+---------+------------+
| 1  | ABC  | REMOVED | 2222222222 |
+----+------+---------+------------+
| 1  | ABC  | CHECKED | 3333333333 |
+----+------+---------+------------+
| 2  | XYZ  | ADDED   | 4444444444 |
+----+------+---------+------------+
| 2  | XYZ  | CHECKED | 5555555555 |
+----+------+---------+------------+
| 5  | DEF  | CHECKED | 6666666666 |
+----+------+---------+------------+
| 5  | DEF  | CHECKED | 7777777777 |
+====+======+=========+============+

What I want is a single record for each book. When a book has several records, prioritize by Action: CHECKED has the lowest priority, and all other actions share the same, higher priority. Among records with the same priority, pick the most recent one.

Expected Output


+====+======+=========+============+
| Id | Book | Action  | Timestamp  |
+====+======+=========+============+
| 1  | ABC  | REMOVED | 2222222222 |
+----+------+---------+------------+
| 2  | XYZ  | ADDED   | 4444444444 |
+----+------+---------+------------+
| 5  | DEF  | CHECKED | 7777777777 |
+====+======+=========+============+

Solution

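A simple method uses row_number(). Number the rows within each id so that non-CHECKED actions sort first and, within the same priority, the most recent timestamp sorts first; then keep only the first row per id. String comparison is case-sensitive by default, so the literal has to be 'CHECKED', as it appears in the data:

select t.*
from (select t.*,
             row_number() over (partition by id
                                -- CHECKED maps to 1, every other action to 2; desc puts the 2s first
                                order by (case action when 'CHECKED' then 1 else 2 end) desc,
                                         -- within the same priority, newest row first
                                         timestamp desc
                               ) as seqnum
      from t
     ) t
where seqnum = 1;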
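
To try this approach without touching a real table, here is a minimal sketch that feeds the sample rows from the question through the same row_number() ranking. The CTE named t is only a stand-in for the actual table, and the alias ranked is made up for the example. The column name is double-quoted as a precaution, since TIMESTAMP is also a type name; that quoting is an assumption on my side, not something the question requires.

```sql
-- Stand-in for the real table: the seven sample rows from the question.
with t as (
    select 1 as id, 'ABC' as book, 'ADDED' as action, 1111111111 as "timestamp"
    union all select 1, 'ABC', 'REMOVED', 2222222222
    union all select 1, 'ABC', 'CHECKED', 3333333333
    union all select 2, 'XYZ', 'ADDED',   4444444444
    union all select 2, 'XYZ', 'CHECKED', 5555555555
    union all select 5, 'DEF', 'CHECKED', 6666666666
    union all select 5, 'DEF', 'CHECKED', 7777777777
)
select id, book, action, "timestamp"   -- list columns explicitly to leave seqnum out
from (select t.*,
             row_number() over (partition by id
                                -- non-CHECKED rows (2) sort ahead of CHECKED rows (1)
                                order by (case action when 'CHECKED' then 1 else 2 end) desc,
                                         -- tie-break on the most recent timestamp
                                         "timestamp" desc
                               ) as seqnum
      from t
     ) ranked
where seqnum = 1
order by id;
```

With these rows the query should return the three records listed under Expected Output: REMOVED for ABC, ADDED for XYZ, and the later CHECKED row for DEF, which only wins because that book has no other action.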