The execution plan for this query contains the Eager operator, which forces all dependent data to be materialized in main memory before proceeding


I am running this code in Neo4j:

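// Creates a relationship of type "Mentioned".
// CSV columns: userid, userid, timestamp
LOAD CSV FROM "file:///datafile.csv" AS row
MERGE (u:ChatItem {id: toInteger(row[0])})
MERGE (t:User {id: toInteger(row[1])})
// The end node is t, the variable bound by the second MERGE (the original referenced an unbound variable c).
MERGE (u)-[:Mentioned {timeStamp: toInteger(row[2])}]->(t)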

And I got this warning: The execution plan for this query contains the Eager operator, which forces all dependent data to be materialized in main memory before proceeding

Using LOAD CSV with a large data set in a query where the execution plan contains the Eager operator could potentially consume a lot of memory and is likely to not perform well. See the Neo4j Manual entry on the Eager operator for more information and hints on how problems could be avoided.

Here is a sample of the dataset:

6824,1847,1464235815.0
6865,789,1464239415.0
6906,518,1464243003.0
6934,240,1464243031.0
6968,1482,1464244803.0
6976,1792,1464244811.0
6983,767,1464244818.0

What does it mean, and what can I do about it?


Solution

  • This happens because each MERGE either matches an existing node or creates a new one, and the final MERGE then either matches or creates a relationship between those nodes, all in the same statement.

    Within a single Cypher statement, Neo4j has to isolate changes that affect matches further on, e.g. when you CREATE nodes with a label that are suddenly matched by a later MATCH or MERGE operation.

    That's why you have an eager operation.

    To avoid it, you can:

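    • Change the MERGE on the relationship to a CREATE, if that is possible (CREATE always adds a new relationship, so it only fits when duplicates cannot occur or do not matter)
    • Split the import into two scripts: one that creates the nodes, and one that only creates the relationship (a MATCH, MATCH, MERGE query)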
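
    The two options above could be sketched roughly as follows. This is only an illustration built from the labels, properties, and file name in the question. The relationship is assumed to run from the ChatItem node u to the User node t, the "Option A" / "Option B" names are just for this example, and running one MERGE per label in its own pass is a conservative choice rather than a documented requirement.

    ```cypher
    // Option A: keep one statement, but CREATE the relationship.
    // Caveat: re-running the import adds duplicate relationships.
    LOAD CSV FROM "file:///datafile.csv" AS row
    MERGE (u:ChatItem {id: toInteger(row[0])})
    MERGE (t:User {id: toInteger(row[1])})
    CREATE (u)-[:Mentioned {timeStamp: toInteger(row[2])}]->(t);

    // Option B, pass 1a: create the ChatItem nodes only.
    LOAD CSV FROM "file:///datafile.csv" AS row
    MERGE (:ChatItem {id: toInteger(row[0])});

    // Option B, pass 1b: create the User nodes only.
    LOAD CSV FROM "file:///datafile.csv" AS row
    MERGE (:User {id: toInteger(row[1])});

    // Option B, pass 2: look up both endpoints, then add the relationship.
    // Rows whose nodes are missing produce no match and are skipped.
    LOAD CSV FROM "file:///datafile.csv" AS row
    MATCH (u:ChatItem {id: toInteger(row[0])})
    MATCH (t:User {id: toInteger(row[1])})
    MERGE (u)-[:Mentioned {timeStamp: toInteger(row[2])}]->(t);
    ```

    Run either Option A on its own or the three Option B statements in order, not both. Prefixing a statement with EXPLAIN shows its execution plan without running it, which is one way to check whether the Eager operator still appears.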