Search code examples
sqlpostgresqlcumulative-sum

Generate counts of users per payments given payments per user


I have the following sample data:

product payments instances
Professional 3 1
Professional 4 1
Starter 1 29
Starter 2 8
Starter 3 4
Team 1 1
Team 2 2

whereas:

  • "instances" stands for users
  • "product" stands for made plans
  • "payments" are just occurred payments

I need to transform this from a count of users who have made X payments to the amount of users that made X payments.

Expected result:

product payments instances
Professional 1 2
Professional 2 2
Professional 3 2
Professional 4 1
Starter 1 41
Starter 2 12
Starter 3 4
Starter 4 0
Team 1 3
Team 2 2
Team 3 0
Team 4 0

The difference is that now, users who have made (for example) 2 payments are also counted within the number of users who have made one payment. Because logically, if you've made 2 payments, you've also made at least one payment.

Here's where I'm at:

with pay_counts as (
    {{#102}}
),
plan_pay_counts as (
    select
        plan,
        payments,
        count(plan) as number_of_users
    from 
        pay_counts
    group by
        plan, payments
),
i as (
    SELECT 
        generate_series(1,max(payments)) as payments
    FROM 
        pay_counts
)
select * from plan_pay_counts
order by plan, payments;

In theory, i will be the column expressing the number of payments.


Solution

  • An optional way of approaching this problem is by:

    • selecting unique products
    • generating series of possible amounts for payments
    • left-joining the combination of products and payments, with your original table, on matching products and payments
    • computing a reversed running sum of instances, from the highest payment to the lowest, for each product partition
    WITH products AS (
        SELECT DISTINCT product FROM tab
    ), payments AS (
        SELECT payment
        FROM GENERATE_SERIES(1, (SELECT MAX(payments) FROM tab)) AS payment 
    )
    SELECT p1.product,
           p2.payment,
           COALESCE(SUM(tab.instances) OVER(PARTITION BY p1.product 
                                            ORDER     BY p2.payment DESC), 0) AS instances
    FROM       products p1
    CROSS JOIN payments p2
    LEFT JOIN tab 
           ON p1.product = tab.product 
          AND p2.payment = tab.payments
    ORDER BY product, payment
    

    Output:

    product payment instances
    Professional 1 2
    Professional 2 2
    Professional 3 2
    Professional 4 1
    Starter 1 41
    Starter 2 12
    Starter 3 4
    Starter 4 0
    Team 1 3
    Team 2 2
    Team 3 0
    Team 4 0

    Check the demo here.