This is a table of user statuses over the period of 9/1/2021 to 9/10/2021. 1
means "active." 0
means "canceled."
date | user | status |
---|---|---|
9/1/2021 | 1 | 1 |
9/1/2021 | 2 | 0 |
9/1/2021 | 3 | 1 |
9/2/2021 | 1 | 1 |
9/2/2021 | 2 | 1 |
9/2/2021 | 3 | 1 |
9/3/2021 | 1 | 0 |
9/3/2021 | 2 | 1 |
9/3/2021 | 3 | 1 |
9/4/2021 | 1 | 0 |
9/4/2021 | 2 | 1 |
9/4/2021 | 3 | 1 |
9/5/2021 | 1 | 0 |
9/5/2021 | 2 | 1 |
9/5/2021 | 3 | 0 |
9/6/2021 | 1 | 1 |
9/6/2021 | 2 | 1 |
9/6/2021 | 3 | 0 |
9/7/2021 | 1 | 1 |
9/7/2021 | 2 | 1 |
9/7/2021 | 3 | 0 |
9/8/2021 | 1 | 0 |
9/8/2021 | 2 | 1 |
9/8/2021 | 3 | 1 |
9/9/2021 | 1 | 0 |
9/9/2021 | 2 | 1 |
9/9/2021 | 3 | 1 |
9/10/2021 | 1 | 1 |
9/10/2021 | 2 | 0 |
9/10/2021 | 3 | 1 |
I want to get the start and end date for each user's active and canceled periods during this time. I know this involves a window function, but I can't quite figure out how to do it. This is my desired output:
user | status | start date | end date |
---|---|---|---|
1 | 1 | 9/1/2021 | 9/2/2021 |
1 | 0 | 9/3/2021 | 9/5/2021 |
1 | 1 | 9/6/2021 | 9/7/2021 |
1 | 0 | 9/8/2021 | 9/9/2021 |
1 | 1 | 9/10/2021 | 9/10/2021 |
2 | 0 | 9/1/2021 | 9/1/2021 |
2 | 1 | 9/2/2021 | 9/9/2021 |
2 | 0 | 9/10/2021 | 9/10/2021 |
3 | 1 | 9/1/2021 | 9/4/2021 |
3 | 0 | 9/5/2021 | 9/7/2021 |
3 | 1 | 9/8/2021 | 9/10/2021 |
I was able to figure it out.
The key components are filtering for when the current status does not equal the previous status. This indicates a date when the status of the user changes.
When you filter for these rows, you can just use the LEAD()
window function and subtract 1 day to get the end date for that status.
with win as
(
select
usr
, dt
, lag(status) over (partition by usr order by dt) as prev_status
, status
from subs
)
select
usr
, status
, dt as start_date
, coalesce(lead(dt) over (partition by usr order by dt) - interval '1 day', (select max(dt) from win)) as end_date
from win
where
status <> prev_status
or prev_status is null