I have Data which looks like this:
ID | SERVICE_TYPE | CNT |
---|---|---|
12345678 | SACCOS | 0 |
12345678 | RETAIL_SME | 0 |
12345678 | TRANS_LOGISTIC | 0 |
12345678 | HEALTHCARE | 0 |
12345678 | ENERGY | 7 |
12345678 | NGO | 0 |
12345678 | TECH_SERVICES | 0 |
12345678 | BUSINESS_SERVICES | 0 |
12345678 | RELIGIOUS | 0 |
12345678 | INSURANCE | 0 |
12345678 | EDUCATION | 0 |
12345678 | MICROFINANCE | 0 |
12345678 | LOAN_REP_COUNT | 2 |
12345678 | WATER_BILL | 0 |
12345678 | PAY_TV | 0 |
12345678 | MERCHPAY_OP | 1 |
12345678 | MERCHPAY_AM | 0 |
12345678 | BETTING | 0 |
12345678 | AMTk_BUNDLE | 0 |
12345678 | AMTK_AIRTIME_RECH | 1 |
12345678 | ELECTRICITY_BILL | 0 |
12345678 | W2B | 0 |
12345678 | P2P_UNREG | 0 |
12345678 | P2P_OPTK | 1 |
12345678 | P2P_AMTK | 1 |
12345678 | CASHOUT | 0 |
12345678 | IMT | 0 |
12345678 | AGRICULTURE | 0 |
12345678 | AVIATION | 0 |
12345678 | SPORTS | 0 |
12345678 | CONSTRUCTION | 0 |
12345678 | HORECA | 0 |
12345678 | AGGREGATOR | 0 |
12345678 | MEDIA | 0 |
12345678 | FMCG | 0 |
The goal is to find the preferred product of customer, in the way that: I get o understand customer A prefers product X(s), and the next Best offer is Product Y(s), and the Third Best Offer is Product Z (s).
For products that have same number of count they should appear in one column. for instance for this case.
It should look like this:
ID | FIRST_PREF_LEG | SECOND_PREF_LEG | THIRD_PREF_LEG |
---|---|---|---|
12345678 | ENERGY | LOAN_REP_COUNT | AMTK_AIRTIME_RECH,P2P_OPTK,P2P_AMTK |
Another example is if the customer has two product with Equal CNT let say 2, that should be customer's first Pref Leg and all products should be there
If customer has only 0 CNT that we should label that customer as 'Non user' across all Pref Leg.
I have tried this:
WITH RankedServices AS
(
SELECT
ID,
COALESCE(SERVICE_TYPE, 'non-user') AS SERVICE_TYPE,
RANK() OVER (PARTITION BY ID ORDER BY COALESCE(CNT, 0) DESC) AS rnk
FROM
Table_name_tableA
WHERE
COALESCE(CNT, 0) > 0 -- Exclude products with count 0
)
SELECT
ID,
MAX(CASE WHEN rnk = 1 THEN SERVICE_TYPE END) AS First_Pref_Leg,
MAX(CASE WHEN rnk = 2 THEN SERVICE_TYPE END) AS Second_Pref_Leg,
MAX(CASE WHEN rnk = 3 THEN SERVICE_TYPE END) AS Third_Pref_Leg
FROM
RankedServices
GROUP BY
ID;
And here is the result:
ID | FIRST_PREF_LEG | SECOND_PREF_LEG | THIRD_PREF_LEG |
---|---|---|---|
12345678 | ENERGY | LOAN_REP_COUNT | P2P_OP |
While I was expecting:
ID | FIRST_PREF_LEG | SECOND_PREF_LEG | THIRD_PREF_LEG |
---|---|---|---|
12345678 | ENERGY | LOAN_REP_COUNT | AMTK_AIRTIME_RECH,P2P_OPTK,P2P_AMTK |
See example.
Filter WHERE CNT>0
not used, because you will get 'non'
value for this case.
select Id
,coalesce(First_pref_leg,'non-user') AS First_pref_leg
,coalesce(Second_pref_leg,'non-user') AS Second_pref_leg
,coalesce(Third_pref_leg,'non-user') AS Third_pref_leg
from(
SELECT Id
,listagg(CASE WHEN rnk = 1 and cnt>0 THEN SERVICE_TYPE END,',') AS First_Pref_Leg
,listagg(CASE WHEN rnk = 2 and cnt>0 THEN SERVICE_TYPE END,',') AS Second_Pref_Leg
,listagg(CASE WHEN rnk = 3 and cnt>0 THEN SERVICE_TYPE END,',') AS Third_Pref_Leg
FROM
( SELECT *
,dense_RANK() OVER (PARTITION BY ID ORDER BY CNT DESC) AS rnk
FROM tableA
)AS RankedServices
GROUP BY ID
) AS Aggregated;