neo4j, cypher, neo4j-apoc

How to stack consecutive APOC statements in Neo4j Cypher


I stumbled on the following syntax for setting labels during a LOAD CSV operation in Neo4j with Cypher. It works but I don't understand why, and all of the following modifications break it:

  • Removing any of the YIELD statements
  • Changing YIELD node to YIELD node2 or any other name on the 2nd and 3rd YIELD statements
  • Removing any of the repeated WITH n,row statements between the YIELD statements
  • Adding UNION between the calls (at least, I couldn't get it to work)

Can anyone enlighten me? I'm a relative newcomer to Cypher and APOC and I'd love to understand how to make repeated APOC calls properly.

LOAD CSV WITH HEADERS FROM 'file:///myfile.csv' AS row
MERGE (n:Person{id:row.ID,name:row.Name}) 
WITH n,row
CALL apoc.create.addLabels(id(n), [row.Title,row.Position]) YIELD node
WITH n,row
CALL apoc.create.addLabels(id(n), split(row.Roles, ',')) YIELD node 
WITH n,row
CALL apoc.create.addLabels(id(n), split(row.Aliases, ',')) YIELD node

Solution

  • You should read the documentation for CALL, WITH, and UNION.

    Here are answers to your specific questions:

    • When you CALL a procedure that can return results, you generally must also specify a YIELD for at least one of the procedure's result fields and use the exact field name. You can use SHOW PROCEDURES to get the signature of a procedure, which includes its result fields. For example, to get the signature of apoc.create.addLabels:

      SHOW PROCEDURES YIELD name, signature
      WHERE name = 'apoc.create.addLabels'
      RETURN signature
      
    • The names in YIELD must match the result field names in the procedure's signature. You cannot use arbitrary names, although you can immediately use AS to rename a field (e.g., YIELD officialName AS foo).

    • Cypher does not permit multiple YIELDs to produce the same variable name (in the same scope). To get around this, you can use AS to rename such variables. For example, this should work:

      LOAD CSV WITH HEADERS FROM 'file:///myfile.csv' AS row
      MERGE (n:Person{id:row.ID,name:row.Name}) 
      WITH n,row
      CALL apoc.create.addLabels(id(n), [row.Title,row.Position]) YIELD node AS _
      CALL apoc.create.addLabels(id(n), split(row.Roles, ',')) YIELD node AS __
      CALL apoc.create.addLabels(id(n), split(row.Aliases, ',')) YIELD node
      RETURN node
      

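      (Also FYI: WITH changes the set of variables that are in scope. If any subsequent clause needs a variable, every WITH before it must list that variable. This is also why your original query works: each WITH n,row drops node from scope, so the next YIELD node does not clash with the previous one.)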

    • As documented, a UNION "combines the results of two or more queries into a single result set". You cannot just put UNION between arbitrary clauses.
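
    To illustrate the UNION point, here is a minimal sketch of a valid use. It reuses the Person label and name property from your query; the STARTS WITH filters are made up purely for illustration. Each side of the UNION is a complete query ending in its own RETURN, and both sides return the same column name:

    ```cypher
    // First complete query: people whose name starts with 'A' (hypothetical filter)
    MATCH (p:Person)
    WHERE p.name STARTS WITH 'A'
    RETURN p.name AS name
    UNION
    // Second complete query: must return the same column name(s) as the first
    MATCH (p:Person)
    WHERE p.name STARTS WITH 'B'
    RETURN p.name AS name
    ```

    Putting UNION between your CALL clauses fails because the parts on either side are not complete queries of this kind.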