I have the following sample data:
product | payments | instances |
---|---|---|
Professional | 3 | 1 |
Professional | 4 | 1 |
Starter | 1 | 29 |
Starter | 2 | 8 |
Starter | 3 | 4 |
Team | 1 | 1 |
Team | 2 | 2 |
whereas:
I need to transform this from a count of users who have made X payments to the amount of users that made X payments.
Expected result:
product | payments | instances |
---|---|---|
Professional | 1 | 2 |
Professional | 2 | 2 |
Professional | 3 | 2 |
Professional | 4 | 1 |
Starter | 1 | 41 |
Starter | 2 | 12 |
Starter | 3 | 4 |
Starter | 4 | 0 |
Team | 1 | 3 |
Team | 2 | 2 |
Team | 3 | 0 |
Team | 4 | 0 |
The difference is that now, users who have made (for example) 2 payments are also counted within the number of users who have made one payment. Because logically, if you've made 2 payments, you've also made at least one payment.
Here's where I'm at:
with pay_counts as (
{{#102}}
),
plan_pay_counts as (
select
plan,
payments,
count(plan) as number_of_users
from
pay_counts
group by
plan, payments
),
i as (
SELECT
generate_series(1,max(payments)) as payments
FROM
pay_counts
)
select * from plan_pay_counts
order by plan, payments;
In theory, i
will be the column expressing the number of payments.
An optional way of approaching this problem is by:
WITH products AS (
SELECT DISTINCT product FROM tab
), payments AS (
SELECT payment
FROM GENERATE_SERIES(1, (SELECT MAX(payments) FROM tab)) AS payment
)
SELECT p1.product,
p2.payment,
COALESCE(SUM(tab.instances) OVER(PARTITION BY p1.product
ORDER BY p2.payment DESC), 0) AS instances
FROM products p1
CROSS JOIN payments p2
LEFT JOIN tab
ON p1.product = tab.product
AND p2.payment = tab.payments
ORDER BY product, payment
Output:
product | payment | instances |
---|---|---|
Professional | 1 | 2 |
Professional | 2 | 2 |
Professional | 3 | 2 |
Professional | 4 | 1 |
Starter | 1 | 41 |
Starter | 2 | 12 |
Starter | 3 | 4 |
Starter | 4 | 0 |
Team | 1 | 3 |
Team | 2 | 2 |
Team | 3 | 0 |
Team | 4 | 0 |
Check the demo here.