I'm building an API in Flask that talks to a Neo4j database. One particularly large query (15+ minutes) breaks the API: the Docker container it runs in stops without logging the request. The trouble is that I can't reproduce the error when I run the API locally.
What I think I need is a way to run the query using py2neo, then return an arbitrary response without waiting for the query to finish.
from flask import jsonify, make_response

def post(self):
    g.cypher.run("MATCH a-[r]-b SET r.cost = "
                 "CASE WHEN r.cost < 1 THEN 0.01*exp(4.60517*(r.costx+0.01)) ELSE r.cost END "
                 "SET r.costx = "
                 "CASE WHEN r.costx < 1 THEN r.costx + 0.01 ELSE r.costx END "
                 "RETURN r")
    return make_response(jsonify({'success': 'all relationship costs increased'}), 200)
I'm really not an ops guy, so any broader insights into this conundrum are most welcome.
How much data do you have in that database? You might be better off running start r=rel(*) ...
Why do you return r in the first place?
I would batch your query and add this condition:
START r=rel(*)
WITH r
WHERE r.cost < 1 OR r.costx < 1
WITH r
SKIP {batchSize} LIMIT 100000
SET r.cost = CASE WHEN r.cost < 1 THEN 0.01*exp(4.60517*(r.costx+0.01)) ELSE r.cost END
WITH r WHERE r.costx < 1
SET r.costx = r.costx + 0.01
Or, better, run it in two passes:
START r=rel(*)
WITH r
WHERE r.cost < 1
WITH r
SKIP {batchSize} LIMIT 100000
SET r.cost = 0.01*exp(4.60517*(r.costx+0.01))
and the same for r.costx (WHERE r.costx < 1, then SET r.costx = r.costx + 0.01).
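To tie this back to the original question (returning a response without waiting for the query), here is a minimal sketch of how the batches could be driven from Python in a background thread. It is not a definitive implementation. The names `run_batches`, `start_job`, `jobs` and the `run_batch` callable are all hypothetical, and `run_batch(skip, limit)` is assumed to execute one batched statement and return how many relationships it touched, which means the queries above would need something like `RETURN count(r)` appended.

```python
import threading
import uuid

BATCH_SIZE = 100000  # same value as the LIMIT in the queries above

# Hypothetical in-memory job registry: job id -> status dict.
# It is lost when the process restarts, so it is only good for a sketch.
jobs = {}
jobs_lock = threading.Lock()


def run_batches(run_batch, batch_size=BATCH_SIZE):
    """Call run_batch(skip, limit) until a batch comes back short.

    run_batch is a placeholder for whatever executes one batched Cypher
    statement and returns the number of relationships it touched
    (an assumption, see the note above about RETURN count(r)).
    """
    skip = 0
    total = 0
    while True:
        touched = run_batch(skip, batch_size)
        total += touched
        if touched < batch_size:
            # A short (or empty) batch means there is nothing left to do.
            return total
        skip += batch_size


def start_job(work):
    """Run work() in a background thread and return (job_id, thread) at once."""
    job_id = uuid.uuid4().hex
    with jobs_lock:
        jobs[job_id] = {"state": "running", "result": None, "error": None}

    def target():
        try:
            update = {"state": "done", "result": work()}
        except Exception as exc:  # record the failure instead of losing it
            update = {"state": "failed", "error": str(exc)}
        with jobs_lock:
            jobs[job_id].update(update)

    thread = threading.Thread(target=target, daemon=True)
    thread.start()
    return job_id, thread


# Inside the Flask handler this could look roughly like
# (my_run_batch is a hypothetical function wrapping the py2neo call):
#     job_id, _ = start_job(lambda: run_batches(my_run_batch))
#     return make_response(jsonify({'job': job_id}), 202)
```

The handler then answers immediately with 202 and a job id, and a second endpoint could look the id up in `jobs` to report progress. Two caveats: the thread dies with the process, so if the container or worker is killed the remaining batches are lost; and because each batch re-evaluates the WHERE filter, relationships that stop matching after an update shift the later SKIP offsets, which this sketch does not try to correct.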