I want to generate another table based on the data seen in one table.
Customer product_1 product_2 product_3
cust_1 0 1 0
cust_2 1 1 0
cust_3 1 1 1
I am interested in computing the purchase rate for each product.
For example, the purchase rate for product_1 would be
(count of (where product_1 = 1)/(count of (where product_1 = 0) + count of (where product_1 = 1))) * 100
assuming 1 = purchase, 0 = no purchase
i.e. 2/3 * 100 = 66.67
The output table should be as follows-
Products Purchased_quantity Not_purchased_quantity Purchase_rate
product_1 2 1 66.67
product_2 3 0 100
product_3 1 2 33.33
Below is for BigQuery Standard SQL and does not require hardcoded names and rather does all dynamically - for any [reasonable] number of product columns ...
#standardSQL
SELECT product,
SUM(purchase) AS Purchased_quantity,
SUM(1 - purchase) AS Not_purchased_quantity,
ROUND(100 * AVG(purchase), 2) AS Purchase_rate
FROM (
SELECT
SPLIT(kv, ':')[OFFSET(0)] product,
CAST(SPLIT(kv, ':')[OFFSET(1)] AS INT64) purchase
FROM `project.dataset.table` t,
UNNEST(SPLIT(REPLACE(TRIM(TO_JSON_STRING(t), '{}'), '"', ''))) kv
WHERE SPLIT(kv, ':')[OFFSET(0)] != 'Customer'
)
GROUP BY product
If to applied to sample data from y our question - the output is
Row product Purchased_quantity Not_purchased_quantity Purchase_rate
1 product_1 2 1 66.67
2 product_2 3 0 100.0
3 product_3 1 2 33.33