I want to filter out records where there is a record with a start_time
before than start_time
of created status
for each id
.
e.g. below id A has 'failed' status before 'created' based on start_time
. So it needs to be filtered.
Whereas id B has first 'created', then another status which is acceptable.
So the expected result is only this, but I'm looking for a scalable solution that works for thousands of rows.
WITH t1 AS (
SELECT 'A' AS id, 'failed' AS status, '2021-05-18 18:30:00'::timestamp AS start_time UNION ALL
SELECT 'A' AS id, 'created' AS status, '2021-05-24 11:30:00'::timestamp AS start_time UNION ALL
SELECT 'A' AS id, 'created' AS status, '2021-05-24 12:00:00'::timestamp AS start_time UNION ALL
SELECT 'B' AS id, 'created' AS status, '2021-05-19 18:30:00'::timestamp AS start_time UNION ALL
SELECT 'B' AS id, 'successful' AS status, '2021-05-20 11:30:00'::timestamp AS start_time
)
SELECT *
FROM t1
There are multiple ways to achieve this but here is an approach using first_value
.
with t1 (id, status, start_time) as
(select 'a', 'failed', '2021-05-18 18:30:00'::timestamp union all
select 'a', 'created', '2021-05-24 11:30:00'::timestamp union all
select 'a', 'created', '2021-05-24 12:00:00'::timestamp union all
select 'b', 'created', '2021-05-19 18:30:00'::timestamp union all
select 'b', 'successful', '2021-05-20 11:30:00'::timestamp)
select *
from t1
qualify first_value(status) over (partition by id order by start_time asc) = 'created'
All you're doing is making sure that the first status for any given id is "created". Think of qualify
clause as having
clause for window functions
. You can also break this down into a subquery if you find that more readable.
Note: Solution above will also keep records that have only "created" status. If you want to make sure there are least two different statuses per id, modify it to
select *
from t1
qualify first_value(status) over (partition by id order by start_time asc) = 'created'
and
count(distinct status) over (partition by id) > 1;