neo4j cypher

Can someone recommend a more efficient Cypher query that performs the following functionality?


I am currently using the following query to delete a batch of nodes from my Neo4j graph database:

String cypherCascadeDel = "UNWIND $conditions AS cond " +
            "MATCH (node:" + tableName + ") " +
            "WHERE ALL(key IN keys(cond) WHERE node[key] = cond[key]) " +
            "DETACH DELETE node";

where $conditions is a list of maps of type Map<String, Object>. The list can contain up to 1 million entries, and each map can contain up to 5 key-value pairs.

This query is taking a long time to run - around 15-20 minutes for 100,000 entries. Is there a better way to scale this?


Solution

  • Your predicate uses dynamic property access (node[key]), which does not allow an index lookup, so the query falls back to label scans.

    The problem is that your UNWIND generates rows, and operations in Cypher execute per row.

    So for 100k entries, your UNWIND generates 100k rows, and so MATCH is called 100k times, resulting in 100k NodeByLabelScan executions.

    Instead, you might want to change your approach: omit the UNWIND and move the whole list into the WHERE clause, like this.

    ...
    WHERE ANY(map IN $conditions WHERE ALL(key IN keys(map) WHERE node[key] = map[key]))
    ...
    

    This should keep it to a single NodeByLabelScan. For each node, it checks the map conditions in your list, and as soon as one matches, it accepts the node for deletion and moves on to the next node.
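
Putting the pieces together, a minimal Java sketch of the rewritten query might look like the following. It assumes the elided parts of the snippet above are the same MATCH and DETACH DELETE clauses as in the question. The class name BatchDeleteSketch and the helpers buildDeleteQuery and matchesAny are hypothetical names introduced for illustration. matchesAny is only a plain-Java model of the ANY/ALL predicate (it ignores Cypher's null and numeric-equality rules), included to show the short-circuit behaviour described above.

```java
import java.util.List;
import java.util.Map;

public class BatchDeleteSketch {

    // Builds the single-scan delete query (assumed shape, not confirmed by the answer).
    // The label is concatenated into the string, exactly as in the question.
    public static String buildDeleteQuery(String label) {
        return "MATCH (node:" + label + ") "
             + "WHERE ANY(map IN $conditions WHERE "
             + "ALL(key IN keys(map) WHERE node[key] = map[key])) "
             + "DETACH DELETE node";
    }

    // Rough in-memory model of the predicate: a node is accepted as soon as
    // one condition map has all of its key/value pairs present on the node.
    // Assumes condition values are non-null.
    public static boolean matchesAny(Map<String, Object> node,
                                     List<Map<String, Object>> conditions) {
        return conditions.stream().anyMatch(cond ->
            cond.entrySet().stream().allMatch(e ->
                e.getValue().equals(node.get(e.getKey()))));
    }

    public static void main(String[] args) {
        // Two hypothetical condition maps and one hypothetical node.
        List<Map<String, Object>> conditions = List.of(
            Map.of("id", 1, "name", "a"),
            Map.of("id", 2));
        Map<String, Object> node = Map.of("id", 2, "name", "b");

        System.out.println(buildDeleteQuery("Person"));
        System.out.println(matchesAny(node, conditions));
    }
}
```

The resulting string would be run once, with the full list bound to $conditions, through whatever driver call you already use. Whether this is actually faster on your data is worth confirming with PROFILE.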