Search code examples
sqlpostgresqlwindow-functionspostgresql-performance

PostgreSQL window function: partition with condition


I'm trying to create a query with fields:

  • data - Date sequence
  • user_id - User ID
  • is_paid - Did the user pay this month
  • number_of_not_paid - the number of months since the last payment, if the payment is in this month, then 0
select
   date,
   user_id,
   is_paid,
   (case when is_paid
       then 0
       else sum(case when is_paid then 0 else 1 end) over (partition by user_id order by date)
   end)
from data

The result I get is:

date user_id is_paid num
2020-01-01 1 true 0
2020-02-01 1 false 1
2020-03-01 1 false 2
2020-04-01 1 true 0
2020-05-01 1 false 3
2020-06-01 1 true 0

And the result I would like to get is:

date user_id is_paid num
2020-01-01 1 true 0
2020-02-01 1 false 1
2020-03-01 1 false 2
2020-04-01 1 true 0
2020-05-01 1 false 1
2020-06-01 1 true 0

How I can fix my query for correct result?


Solution

  • You want to reset the delinquency timer every time the user pays, so first mark each time the user pays (boolean can be cast to int to be summed):

    with runs as (
      select date, user_id, is_paid, 
             sum(is_paid::int) over (partition by user_id
                                         order by date) as run_number
        from my_table
    )
    

    With these runs marked, you can then sum the preceding false values within the (user_id, run_number) window:

    select date, user_id, is_paid,
           sum((not is_paid)::int) over (partition by user_id, run_number
                                             order by date) as num
      from runs;
    
     date       | user_id | is_paid | num
     :--------- | ------: | :------ | --:
     2020-01-01 |       1 | t       |   0
     2020-02-01 |       1 | f       |   1
     2020-03-01 |       1 | f       |   2
     2020-04-01 |       1 | t       |   0
     2020-05-01 |       1 | f       |   1
     2020-06-01 |       1 | t       |   0
    
    

    db<>fiddle here