My import.csv creates many nodes, and merging them produces a huge Cartesian product that now runs into a transaction timeout because the data has grown so much. I have currently set the transaction timeout to 1 second, because every other query is very quick and should never take longer than one second to finish.
Is there a way to split or execute this specific query in smaller chunks to prevent a timeout?
Raising or disabling the transaction timeout in neo4j.conf is not an option, because the neo4j service needs a restart for every change made to the config.
The query hitting the timeout from my import script:
MATCH (l:NameLabel)
MATCH (m:Movie {id: l.id,somevalue: l.somevalue})
MERGE (m)-[:LABEL {path: l.path}]->(l);
Node counts: 1000 Movie, 2500 NameLabel
You can try installing APOC Procedures and using the procedure apoc.periodic.commit.
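// The WHERE NOT filter skips NameLabel nodes that already have an incoming
// LABEL relationship, so each run picks up a fresh batch of at most {limit} nodes.
CALL apoc.periodic.commit("
MATCH (l:NameLabel)
WHERE NOT (l)<-[:LABEL]-(:Movie)
WITH l LIMIT {limit}
MATCH (m:Movie {id: l.id, somevalue: l.somevalue})
MERGE (m)-[:LABEL {path: l.path}]->(l)
RETURN count(*)
", {limit: 1000})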
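The query above is executed repeatedly, each run in its own transaction, until it returns 0.
You can change the batch size by adjusting the value in {limit: 1000}. On Neo4j 4.0 and later, write the parameter as $limit inside the query, because the {limit} syntax was removed.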
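To check whether the batches covered everything, it may help to count the NameLabel nodes that still have no incoming LABEL relationship. This is only a sketch: it assumes the relationship direction used in the MERGE (Movie to NameLabel), and the alias remaining is just an illustrative name.

```cypher
// Count NameLabel nodes that no Movie points to yet.
MATCH (l:NameLabel)
WHERE NOT (l)<-[:LABEL]-(:Movie)
RETURN count(l) AS remaining;
```

A NameLabel node with no matching Movie (same id and somevalue) never gets a relationship, so it will always show up in this count. If a whole batch happens to consist of such nodes, the batched query returns 0 and stops early.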
Note: remember to install the APOC Procedures release that matches the version of Neo4j you are using. Check the Version Compatibility Matrix.
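As a possible alternative, APOC also provides apoc.periodic.iterate, which takes one statement that produces the items and a second statement that is applied to them in batches. The sketch below is a different technique from the apoc.periodic.commit call, not a drop-in equivalent. It assumes an APOC release that passes the variable l from the first statement into the second (older releases may need an explicit WITH for that).

```cypher
// First statement: stream all NameLabel nodes.
// Second statement: link each one to its matching Movie,
// committing after every batchSize items.
CALL apoc.periodic.iterate(
  "MATCH (l:NameLabel) RETURN l",
  "MATCH (m:Movie {id: l.id, somevalue: l.somevalue})
   MERGE (m)-[:LABEL {path: l.path}]->(l)",
  {batchSize: 1000, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages;
```

Because every NameLabel node is visited exactly once, this variant does not need the WHERE NOT filter to make progress. Keeping parallel set to false avoids concurrent batches competing for locks on the same Movie nodes.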