Tags: python, flask, neo4j, docker, py2neo

Flask API breaks at long Cypher query


I'm building an API in Flask that talks to a Neo4j database. One particularly long-running query (15+ minutes) breaks the API: the Docker container it runs in stops without logging the request. The trouble is that I can't reproduce the error when I run the API locally.

What I think I need is a way to run the query using py2neo, then return an arbitrary response without waiting for the query to finish.

def post(self):
    g.cypher.run("MATCH a-[r]-b SET r.cost = "
    "CASE WHEN r.cost <1 THEN 0.01*exp(4.60517*(r.costx+0.01)) ELSE r.cost END " 
    "SET r.costx = "
    "CASE WHEN r.costx < 1 THEN r.costx + 0.01 ELSE r.costx END "
    "RETURN r")
    return make_response(jsonify({'success': 'all relationship costs increased'}), 200)

I'm really not an ops guy, so any broader insights into this conundrum are most welcome.


Solution

  • How much data do you have in that database? You might be better off running START r=rel(*) ...

    Why do you return r in the first place?

    I would batch your query and add a condition so that only relationships that still need updating are touched:

    START r=rel(*)
    WITH r
    WHERE r.cost < 1 OR r.costx < 1
    WITH r
    SKIP {batchSize} LIMIT 100000
    SET r.cost = CASE WHEN r.cost < 1 THEN 0.01*exp(4.60517*(r.costx+0.01)) ELSE r.cost END 
    WITH r WHERE r.costx < 1
    SET r.costx = r.costx + 0.01
    

    Or, better, run it in two passes:

    START r=rel(*)
    WITH r
    WHERE r.cost < 1
    WITH r
    SKIP {batchSize} LIMIT 100000
    SET r.cost = 0.01*exp(4.60517*(r.costx+0.01)) 
    

    and the same for r.costx.
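The batching above still needs something on the Python side to advance the SKIP offset. A minimal sketch of such a loop follows. It makes several assumptions: `run_batch` is a hypothetical callable that you would implement with your py2neo version, the parameters are renamed to `{skip}` and `{limit}` for clarity, and a `RETURN count(r) AS updated` clause (not in the queries above) is added so the loop knows when to stop.

```python
# First pass from the answer, with two assumed changes: the parameters are
# named {skip} and {limit}, and the number of updated relationships is returned.
BATCH_QUERY = (
    "START r=rel(*) "
    "WITH r WHERE r.cost < 1 "
    "WITH r SKIP {skip} LIMIT {limit} "
    "SET r.cost = 0.01*exp(4.60517*(r.costx+0.01)) "
    "RETURN count(r) AS updated"
)


def update_in_batches(run_batch, batch_size=100000):
    """Call run_batch(skip, limit) until a batch comes back short.

    run_batch is a hypothetical callable: it should execute BATCH_QUERY with
    the given parameters and return the 'updated' count as an int.
    """
    skip = 0
    total = 0
    while True:
        updated = run_batch(skip, batch_size)
        total += updated
        if updated < batch_size:
            # A short (or empty) batch means no further rows remain.
            return total
        skip += batch_size
```

Each call is then a separate, much shorter request than the single 15-minute query. Note that Cypher does not guarantee a stable order for SKIP without ORDER BY, and relationships whose cost reaches 1 drop out of the filter between batches, so treat the offsets as approximate.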