I have the following sample data with 3 columns, i.e., account_id, interval_start, interval_end. And I want to create a reactivated column with the given output in the following table.
For each account_id, reactivated column should start with a 0 value and then I want to check if the previous interval_end and current interval_start value is more than a month then I want to increment that value as in to 1 and then next time when previous interval_end and current interval_start is more than a month, then it should be 2 and so on.
account_id | interval_start | interval_end | reactivated |
---|---|---|---|
ABC | 2015-01-01 00:00:00.000 | 2015-02-01 00:00:00.000 | 0 |
ABC | 2015-02-01 00:00:00.000 | 2015-03-01 00:00:00.000 | 0 |
ABC | 2015-03-01 00:00:00.000 | 2015-04-01 00:00:00.000 | 0 |
ABC | 2015-09-01 00:00:00.000 | 2015-10-01 00:00:00.000 | 1 |
ABC | 2015-10-01 00:00:00.000 | 2015-11-01 00:00:00.000 | 1 |
ABC | 2017-04-01 00:00:00.000 | 2017-05-01 00:00:00.000 | 2 |
ABC | 2017-05-01 00:00:00.000 | 2017-06-01 00:00:00.000 | 2 |
ABC | 2017-06-01 00:00:00.000 | 2017-07-01 00:00:00.000 | 2 |
ABC | 2017-07-01 00:00:00.000 | 2017-08-01 00:00:00.000 | 2 |
ABC | 2019-03-01 00:00:00.000 | 2019-04-01 00:00:00.000 | 3 |
ABC | 2019-04-01 00:00:00.000 | 2019-05-01 00:00:00.000 | 3 |
ABC | 2019-05-01 00:00:00.000 | 2019-06-01 00:00:00.000 | 3 |
ABC | 2021-07-01 00:00:00.000 | 2021-08-01 00:00:00.000 | 4 |
ABC | 2021-08-01 00:00:00.000 | 2021-09-01 00:00:00.000 | 4 |
DEF | 2015-03-01 00:00:00.000 | 2015-04-01 00:00:00.000 | 0 |
DEF | 2015-04-01 00:00:00.000 | 2015-05-01 00:00:00.000 | 0 |
DEF | 2017-06-01 00:00:00.000 | 2017-07-01 00:00:00.000 | 1 |
DEF | 2017-07-01 00:00:00.000 | 2017-08-01 00:00:00.000 | 1 |
DEF | 2019-05-01 00:00:00.000 | 2019-06-01 00:00:00.000 | 2 |
So far I'm only able to get 0s and 1s in the reactivated column by doing the following but its not what I want:
case when
interval_start - lag(interval_end) over (partition by account_id order by interval_start) as prev_interval_end > 30 then 1 else 0 end
How do I get the value to increment as well for each account_id please.
Adding to above answer. Since I'm using Redshift what worked for me is:
sum(case when interval_start > lag_interval_end then 1 else 0 end) over (partition by account_id order by interval_start rows unbounded preceding) as grp