I have a redshift table and I would like to delete records for whose timestamp are too close.
For the following table, I would like to delete line 3 and 4 because date is close to line 1 and other fields are identical.
col1 | col2 | col3
--------------------------
1 02:23 4
51 02:23 29
1 02:22 4
1 02:24 4
1 19:57 4
Hmmm . . . You can use lead()
:
select t.*
from (select t.*,
lead(col2) over (partition by col1, col3 over order by col2) as next_col2
from t
) t
where next_col2 is null or
next_col2 < col2 + interval '1 hour'. -- or whatever your cutoff is