I'm trying to create a query with fields:
select
date,
user_id,
is_paid,
(case when is_paid
then 0
else sum(case when is_paid then 0 else 1 end) over (partition by user_id order by date)
end) as num
from data
The result I get is:
date | user_id | is_paid | num |
---|---|---|---|
2020-01-01 | 1 | true | 0 |
2020-02-01 | 1 | false | 1 |
2020-03-01 | 1 | false | 2 |
2020-04-01 | 1 | true | 0 |
2020-05-01 | 1 | false | 3 |
2020-06-01 | 1 | true | 0 |
And the result I would like to get is:
date | user_id | is_paid | num |
---|---|---|---|
2020-01-01 | 1 | true | 0 |
2020-02-01 | 1 | false | 1 |
2020-03-01 | 1 | false | 2 |
2020-04-01 | 1 | true | 0 |
2020-05-01 | 1 | false | 1 |
2020-06-01 | 1 | true | 0 |
How can I fix my query to get the correct result?
You want to reset the delinquency timer every time the user pays, so first mark each payment with a running sum (a boolean can be cast to int to be summed):
with runs as (
select date, user_id, is_paid,
sum(is_paid::int) over (partition by user_id
order by date) as run_number
from my_table
)
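To see what the running sum produces, here is a minimal sketch that inlines the question's six rows instead of reading a real table. The name sample_data is made up for illustration, and PostgreSQL is assumed because of the ::int cast. Every paid row bumps the counter, so each row shares a run_number with the most recent payment; for these rows the values come out as 1, 1, 1, 2, 2, 3.

```sql
-- Rows copied from the question; sample_data is a hypothetical stand-in
-- for the real table.
with sample_data(date, user_id, is_paid) as (
  values
    (date '2020-01-01', 1, true),
    (date '2020-02-01', 1, false),
    (date '2020-03-01', 1, false),
    (date '2020-04-01', 1, true),
    (date '2020-05-01', 1, false),
    (date '2020-06-01', 1, true)
)
select date, user_id, is_paid,
       -- running count of payments so far: goes up by 1 on every paid row
       sum(is_paid::int) over (partition by user_id
                               order by date) as run_number
from sample_data
order by user_id, date;
```

Rows that come before a user's first payment would get run_number 0, which still forms a run of its own.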
With these runs marked, you can then sum the preceding false values within the (user_id, run_number) window:
select date, user_id, is_paid,
sum((not is_paid)::int) over (partition by user_id, run_number
order by date) as num
from runs;
date | user_id | is_paid | num
:--------- | ------: | :------ | --:
2020-01-01 | 1 | t | 0
2020-02-01 | 1 | f | 1
2020-03-01 | 1 | f | 2
2020-04-01 | 1 | t | 0
2020-05-01 | 1 | f | 1
2020-06-01 | 1 | t | 0
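For anyone who wants to try this without creating a table, the two fragments above can be assembled into one statement along these lines. This is only a sketch: sample_data is a hypothetical inline copy of the question's rows, and PostgreSQL is assumed. Swap sample_data for your own table to use it.

```sql
with sample_data(date, user_id, is_paid) as (
  -- hypothetical inline copy of the rows from the question
  values
    (date '2020-01-01', 1, true),
    (date '2020-02-01', 1, false),
    (date '2020-03-01', 1, false),
    (date '2020-04-01', 1, true),
    (date '2020-05-01', 1, false),
    (date '2020-06-01', 1, true)
),
runs as (
  -- step 1: number the runs; a new run starts at every payment
  select date, user_id, is_paid,
         sum(is_paid::int) over (partition by user_id
                                 order by date) as run_number
  from sample_data
)
-- step 2: count unpaid rows inside each (user_id, run_number) run
select date, user_id, is_paid,
       sum((not is_paid)::int) over (partition by user_id, run_number
                                     order by date) as num
from runs
order by user_id, date;
```

Because the paid row is the first row of its run and contributes 0 to the second sum, num restarts at 0 on every payment without needing the outer case expression from the question.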