I am trying to create RFM analysis with Postgresql query. However, i haven't fully completed the query for the Recency dimension.
The query inspired in this article
"https://cooldata.wordpress.com/2014/03/25/an-all-sql-way-to-automate-rfm-scoring/ "
The criteria of the Recency dimension was
and below is the query i've been trying to finish
WITH rfm AS
(SELECT email,
SUM((total_incl_tax)) AS cash,
MAX(decode(order_order.order_date, 2016-01-01, 5, 2016-02-01, 4, 2016-03-01, 3, 2016-04-01, 2, 201605-01, 1)) AS recency,
COUNT(DISTINCT(order_date)) AS frequency
FROM order_order
GROUP BY email)
SELECT rfm.email,
CASE
WHEN rfm.cash >= 2000000 THEN 5
WHEN rfm.cash > 1500000 THEN 4
WHEN rfm.cash > 1000000 THEN 3
WHEN rfm.cash > 500000 THEN 2
WHEN rfm.frequency > 4 THEN 5
WHEN rfm.frequency = 4 THEN 4
WHEN rfm.frequency = 3 THEN 3
WHEN rfm.frequency = 2 THEN 2
WHEN rfm.frequency = 1 THEN 1
else 1
END + rfm.frequency AS rfm_score
--+ Five_years.recency
FROM rfm
GROUP BY rfm.email, rfm.cash,rfm.frequency
ORDER BY rfm.email
The error was :
ERROR: function decode(timestamp with time zone, integer, integer, integer, integer, integer, integer, integer, integer, integer, integer) does not exist Hint: No function matches the given name and argument types. You might need to add explicit type casts. Position: 186
and i assume the error was in this line
MAX(decode(order_order.order_date, 2016-01-01, 5, 2016-02-01, 4, 2016-03-01, 3, 2016-04-01, 2, 2016-05-01, 1)) AS recency
Are there any suggestion in modifying the error line into the criteria stated for the recency dimension? thanks
There is no decode()
function in Postgres. You can substitute this with another CASE
statement:
WITH rfm AS
(
SELECT email,
SUM((total_incl_tax)) AS cash,
MAX(
CASE
WHEN order_order.order_date = '2016-01-01' THEN 5
WHEN order_order.order_date = '2016-02-01' THEN 4
WHEN order_order.order_date = '2016-03-01' THEN 3
WHEN order_order.order_date = '2016-04-01' THEN 2
WHEN order_order.order_date = '2016-05-01' THEN 1
END
) as recency,
COUNT(DISTINCT(order_date)) AS frequency
FROM order_order
GROUP BY email
)
SELECT rfm.email,
CASE
WHEN rfm.cash >= 2000000 THEN 5
WHEN rfm.cash > 1500000 THEN 4
WHEN rfm.cash > 1000000 THEN 3
WHEN rfm.cash > 500000 THEN 2
WHEN rfm.frequency > 4 THEN 5
WHEN rfm.frequency = 4 THEN 4
WHEN rfm.frequency = 3 THEN 3
WHEN rfm.frequency = 2 THEN 2
WHEN rfm.frequency = 1 THEN 1
else 1
END + rfm.frequency + rfm.recency AS rfm_score
FROM rfm
GROUP BY rfm.email, rfm.cash,rfm.frequency
ORDER BY rfm.email
Further reading here: Decode equivalent in postgres