Search code examples
sqlaggregate-functionswindow-functions

SQL status changes with start and end dates


This is a table of user statuses over the period of 9/1/2021 to 9/10/2021. 1 means "active." 0 means "canceled."

date user status
9/1/2021 1 1
9/1/2021 2 0
9/1/2021 3 1
9/2/2021 1 1
9/2/2021 2 1
9/2/2021 3 1
9/3/2021 1 0
9/3/2021 2 1
9/3/2021 3 1
9/4/2021 1 0
9/4/2021 2 1
9/4/2021 3 1
9/5/2021 1 0
9/5/2021 2 1
9/5/2021 3 0
9/6/2021 1 1
9/6/2021 2 1
9/6/2021 3 0
9/7/2021 1 1
9/7/2021 2 1
9/7/2021 3 0
9/8/2021 1 0
9/8/2021 2 1
9/8/2021 3 1
9/9/2021 1 0
9/9/2021 2 1
9/9/2021 3 1
9/10/2021 1 1
9/10/2021 2 0
9/10/2021 3 1

I want to get the start and end date for each user's active and canceled periods during this time. I know this involves a window function, but I can't quite figure out how to do it. This is my desired output:

user status start date end date
1 1 9/1/2021 9/2/2021
1 0 9/3/2021 9/5/2021
1 1 9/6/2021 9/7/2021
1 0 9/8/2021 9/9/2021
1 1 9/10/2021 9/10/2021
2 0 9/1/2021 9/1/2021
2 1 9/2/2021 9/9/2021
2 0 9/10/2021 9/10/2021
3 1 9/1/2021 9/4/2021
3 0 9/5/2021 9/7/2021
3 1 9/8/2021 9/10/2021

Solution

  • I was able to figure it out.

    The key components are filtering for when the current status does not equal the previous status. This indicates a date when the status of the user changes.

    When you filter for these rows, you can just use the LEAD() window function and subtract 1 day to get the end date for that status.

    with win as
    (
    select
      usr
      , dt
      , lag(status) over (partition by usr order by dt) as prev_status
      , status
    from subs
    )
    select
      usr
      , status
      , dt as start_date 
      , coalesce(lead(dt) over (partition by usr order by dt) - interval '1 day', (select max(dt) from win)) as end_date
    from win
    where
      status <> prev_status
      or prev_status is null