I stumbled on the following syntax for setting labels during a LOAD CSV operation in Neo4j with Cypher. It works, but I don't understand why, and all of the following modifications break it:
- removing the YIELD statements
- changing YIELD node to YIELD node2 or any other name on the 2nd and 3rd YIELD statements
- removing the WITH n,row statements between the YIELD statements
- putting UNION between the calls (at least, I couldn't get it to work)
Can anyone enlighten me? I'm a relative newcomer to Cypher and APOC, and I'd love to understand how to make repeated APOC calls properly.
LOAD CSV WITH HEADERS FROM 'file:///myfile.csv' AS row
MERGE (n:Person{id:row.ID,name:row.Name})
WITH n,row
CALL apoc.create.addLabels(id(n), [row.Title,row.Position]) YIELD node
WITH n,row
CALL apoc.create.addLabels(id(n), split(row.Roles, ',')) YIELD node
WITH n,row
CALL apoc.create.addLabels(id(n), split(row.Aliases, ',')) YIELD node
You should read the documentation for CALL, WITH, and UNION.
Here are answers to your specific questions:
When you CALL a procedure that can return results, you generally must also specify a YIELD for at least one of the procedure's result fields and use the exact field name. You can use SHOW PROCEDURES to get the signature of a procedure, which includes its result fields. For example, to get the signature of apoc.create.addLabels:
SHOW PROCEDURES YIELD name, signature
WHERE name = 'apoc.create.addLabels'
RETURN signature
YIELD must specify at least one of the result field names used in the procedure's signature. You cannot use arbitrary names, although you can immediately use AS to rename a field (e.g., YIELD officialName AS foo).
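As a small illustration of renaming a yielded field, here is a sketch that assumes a Person node with id '42' exists and uses a made-up label, Manager. Only the field name node comes from the procedure's signature; everything else is hypothetical.

```cypher
// Hypothetical data: a Person with id '42' and an illustrative label 'Manager'.
MATCH (n:Person {id: '42'})
// 'node' is the procedure's result field; 'labeled' is just a local alias for it.
CALL apoc.create.addLabels(id(n), ['Manager']) YIELD node AS labeled
RETURN labels(labeled) AS currentLabels
```

Writing YIELD labeled on its own would fail, since the procedure has no result field with that name.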
Cypher does not permit multiple YIELDs to produce the same variable name (in the same scope). To get around this, you can use AS to rename such variables. For example, this should work:
LOAD CSV WITH HEADERS FROM 'file:///myfile.csv' AS row
MERGE (n:Person{id:row.ID,name:row.Name})
WITH n,row
CALL apoc.create.addLabels(id(n), [row.Title,row.Position]) YIELD node AS _
CALL apoc.create.addLabels(id(n), split(row.Roles, ',')) YIELD node AS __
CALL apoc.create.addLabels(id(n), split(row.Aliases, ',')) YIELD node
RETURN node
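(Also FYI: WITH changes the set of variables that are in scope. If any subsequent clauses need a variable, then a WITH clause must specify that variable. This is also why your original query works: each WITH n,row drops node from scope, so the next YIELD node does not clash with the previous one.)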
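To make the scoping rule concrete, here is a minimal sketch. The labels Reviewed and Checked are invented for illustration, and it assumes some Person nodes already exist.

```cypher
MATCH (n:Person)
CALL apoc.create.addLabels(id(n), ['Reviewed']) YIELD node
// Both n and node are in scope at this point.
WITH n
// Only n was carried forward, so the name node is free to be yielded again.
CALL apoc.create.addLabels(id(n), ['Checked']) YIELD node
RETURN count(node) AS updated
```

If the WITH n line is removed, the second YIELD node tries to declare a variable that is already in scope, which Cypher rejects.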
As documented, a UNION "combines the results of two or more queries into a single result set". You cannot just put UNION between arbitrary clauses.
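For comparison, here is a sketch of the kind of query UNION is meant for. The Company label and the name property are assumptions made for the example, not something from your data.

```cypher
// Each side of UNION is a complete query ending in RETURN,
// and both sides must return the same column names.
MATCH (p:Person)
RETURN p.name AS name
UNION
MATCH (c:Company)
RETURN c.name AS name
```

UNION removes duplicate rows from the combined result; UNION ALL keeps them.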