I am trying to summaries data from my Data Warehouse into a fact table and I want to be able to count up the number of transactions per type that there were per customer per month.
Columns being used are: Customer_id, Transaction_id, transaction_date, Transaction_type
Ideally what I want to get is.
Customer | Month | transaction_type_1 | transaction_type_2 | Total_transactions |
---|---|---|---|---|
12345 | 1 | 18 | 8 | 26 |
12345 | 2 | 23 | 14 | 37 |
67891 | 1 | 14 | 22 | 36 |
I have to put it into a subquery but I get the total number of type 1 transactions for all customers in each month. I have tried unsuccessfully to using partition on top of that but now very far outside my level.
Select
customer_id,
month,
count(transactions_id),
(select count(transactions_id) from DWH where transaction_type = 1),
(select count(transactions_id) from DWH where transaction_type = 2)
FROM DWH
GROUP BY customer_id, month
Incorrect table output looks something like this.
Customer | Month | transaction_type_1 | transaction_type_2 | Total_transactions |
---|---|---|---|---|
12345 | 1 | 432 | 564 | 26 |
12345 | 2 | 456 | 765 | 37 |
In a standalone table I can get the information, but I can not incorporate it into the fact table view.
Standalone this works to get individual counts of each type, but I haven't been able to rework that into a select subquery:
select customer_id, month, count(*)
FROM DWH
WHERE dwh.transaction_type = 1
Group BY dwh.customer_id, month;
Any help would be much appreciated.
You may be getting incorrect results because the filters in the where clause of your respective subqueries do not consider the group by columns i.e.
The table alias d
helps us to distinguish between columns used in the outer/general query and the subqueries.
select
d.customer_id,
d.month,
count(d.transactions_id),
(
select count(transactions_id)
from DWH
where transaction_type = 1 and
customer_id = d.customer_id and
month = d.month
) as transaction_type_1,
(
select count(transactions_id)
from DWH
where transaction_type = 2 and
customer_id = d.customer_id and
month = d.month
) as transaction_type_2
FROM DWH d
GROUP BY d.customer_id, d.month
However, while this approach may work, it would be best if you tested this performance wise on your respective database and evaluate the cost metrics/query plan.
Another approach that may be performant uses case expressions to achieve the result and has been included below:
SELECT
customer_id as Customer,
month,
COUNT(
CASE WHEN transaction_type=1 THEN transactions_id END
) as transaction_type_1,
COUNT(
CASE WHEN transaction_type=2 THEN transactions_id END
) as transaction_type_2,
COUNT(1) as Total_transactions
FROM
DWH
GROUP BY
customer_id, month