I am trying to solve a question with Cypher in Neo4J where I need to find the BEER_STYLE
that has the best combination of smell and looks. The relations are as follows:
(:REVIEW)<-[:REVIEWED]-(:BEER)-[:HAS_STYLE]->(:BEER_STYLE)
One beer sometimes has more than one review and the Reviews nodes have the two properties of interest: look and smell. Sometimes these 2 can be NaN
and I have to find the style that has the best combination of those 2 (which I thought it could be the average of averages).
I tried to code a general idea and asked ChatGPT to polish until it works, but I'm not very confident and wanted to ask here. This is the code I got, can someone help/correct me to improve it?
MATCH (s:STYLE)<-[h:HAS_STYLE]-(b:BEERS)-[rev:REVIEWED]->(r:REVIEWS)
WITH s,
CASE WHEN r.look <> 'NaN' THEN toFloat(r.look) ELSE null END AS look,
CASE WHEN r.smell <> 'NaN' THEN toFloat(r.smell) ELSE null END AS smell
WITH s, AVG(look) AS avg_look, AVG(smell) AS avg_smell
WITH s, avg_look, avg_smell, ((avg_look + avg_smell) / 2) AS total
RETURN s AS STYLE, MAX(total) AS Final_Class
ORDER BY Final_Class DESC
LIMIT 2
Assuming that the look
and smell
values do not need to be stored as strings, you should first change them to floats to avoid having to do conversions every time you use those properties:
MATCH (r:REVIEWS)
SET r.look = CASE WHEN r.look <> 'NaN' THEN toFloat(r.look) END
SET r.smell = CASE WHEN r.smell <> 'NaN' THEN toFloat(r.smell) END
Notes:
NULL
is the default value for CASE
, so it does not need to be specified in an ELSE
.NULL
to a property, the property is removed.After the above one-time change, your use case can be simplified to:
MATCH (s:STYLE)<-[:HAS_STYLE]-(:BEERS)-[:REVIEWED]->(r:REVIEWS)
RETURN s AS STYLE, ((AVG(r.look) + AVG(r.smell)) / 2) AS Final_Class
ORDER BY Final_Class DESC
LIMIT 2
Notes:
Your original query did not need to call the aggregating function MAX
on the total
value for each s
, since there would only be a single total
per s
. This is because the aggregating function AVG
in the following clause would create only a single avg_look
and avg_smell
per s
:
WITH s, AVG(look) AS avg_look, AVG(smell) AS avg_smell