cypher query in py2neo to update n Nodes with a list of n ids and parameters


Say I want to update a large number of existing nodes using data that is stored, for instance, in a pd.DataFrame. Since I know how to write a parametrized query that handles a single node update, my basic solution is to run this query in a loop, once for each row in the data frame.

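# Naive approach: one query (and one round trip) per row.
# Uses $param syntax; the older {param} form was removed in Neo4j 4.0.
query = '''MATCH (p:Person)
           WHERE p.name = $name AND p.surname = $surname
           SET p.description = $description'''

for _, row in df.iterrows():
    tx.run(query, name=row['name'], surname=row['surname'],
           description=row['description'])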

However, there must be a more direct (and faster) way of passing this information to the query, so that the iteration is handled on the server side. Is there one? I haven't been able to find any documentation on it.


Solution

  • Instead of looping like this, with one Cypher query executed per row, gather all the rows into a single list parameter of maps and run one Cypher query over it. For very large inputs (more than 100k or so entries), split the list into batches. Michael Hunger has a good blog entry on this approach.

    You can use UNWIND on the list parameter to transform it into rows, and handle everything all at once. Assuming you pass in the list as data:

    UNWIND $data as row
    MATCH (p:Person)
    WHERE p.name = row.name AND p.surname = row.surname
    SET p.description = row.description
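On the Python side, a minimal sketch of this could look like the following. It assumes the rows are already a list of dicts with the keys name, surname and description (for a data frame, `df.to_dict('records')` produces that shape). The helper names `chunked` and `update_descriptions` and the default batch size are hypothetical, chosen only for illustration; `run` stands for any callable with the same calling convention as the `tx.run` used in the question.

```python
from typing import Any, Callable, Dict, Iterator, List

# The UNWIND query from above; $data is the list-of-maps parameter.
UPDATE_QUERY = """
UNWIND $data AS row
MATCH (p:Person)
WHERE p.name = row.name AND p.surname = row.surname
SET p.description = row.description
"""


def chunked(rows: List[Dict[str, Any]], size: int) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of `rows`, each with at most `size` items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def update_descriptions(run: Callable[..., Any],
                        rows: List[Dict[str, Any]],
                        batch_size: int = 10000) -> int:
    """Run the UNWIND query once per batch and return the number of queries sent.

    `run` is assumed to accept (query, **parameters), like tx.run in the question.
    """
    queries_sent = 0
    for batch in chunked(rows, batch_size):
        run(UPDATE_QUERY, data=batch)  # one query per batch, not per row
        queries_sent += 1
    return queries_sent
```

With the transaction from the question, this would be called as `update_descriptions(tx.run, df.to_dict('records'))` and then committed as usual. Passing `run` in as an argument keeps the batching logic independent of the driver, so it can be tried out with a stand-in function before pointing it at a real database.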