This is a follow-up to another SO question (Neo4j 2.0 Merge with unique constraints performance bug?), but I'm trying it a different way.
MATCH (c:Contact),(a:Address), (ca:ContactAddress)
WITH c,a,collect(ca) as matrix
FOREACH (car in matrix |
MERGE
(c {ContactId:car.ContactId})
-[r:CONTACT_ADDRESS {ContactId:car.ContactId,AddressId:car.AddressId}]->
(a {AddressId:car.AddressId}))
So this leads to a locked up Neo4j server. I'm trying to wrap my head around why.
My thought process behind the query is the following:
When I run the above code, the server sits at about 40% CPU and memory continues to climb. I stopped it after the browser (myserver:7474/browser) lost its connection, reset my database and tried again, this time using the following:
match (c:Contact),(a:Address), (ca:ContactAddress)
WITH c,a,collect(distinct ca) as matrix
foreach (car in matrix |
CREATE
(c {ContactId:car.ContactId})
-[r:CONTACT_ADDRESS {ContactId:car.ContactId,AddressId:car.AddressId}]->
(a {AddressId:car.AddressId}))
Same results. Locked up, disconnected Neo4j database while CPU stays pegged and RAM usage continues to climb. Is there a loop here that I'm not seeing?
I've also tried this (with the same hang):
FOREACH(row in {PassedInList} |
MERGE (c:Contact {ContactId:row.ContactId})
MERGE (a:Address {AddressId:row.AddressId})
MERGE (c)-[r:CONTACT_ADDRESS]->(a)
)
RESOLVED:
MATCH (ca:ContactAddress)
MATCH (c:Contact {ContactId:ca.ContactId}), (a:Address {AddressId:ca.AddressId})
MERGE p = (c)
-[r:CONTACT_ADDRESS {ContactId:ca.ContactId,AddressId:ca.AddressId}]->
(a)
When you write match (c:Contact),(a:Address), (ca:ContactAddress) with 3 disconnected nodes, Neo4j matches every combination of those 3, i.e. the full cartesian product. If you had 100 of each type of node, that is 100 x 100 x 100 = 1,000,000 rows, and the MERGE inside the FOREACH is then attempted once for each of them.
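To get a feel for how large that product is on your own data, a rough sketch like the one below counts each label separately and multiplies the counts, rather than actually building the product. The aliases contacts, addresses, links and cartesianRows are names made up for this illustration, and the query assumes each label has at least one node (otherwise it returns no row).

```cypher
// Count each label on its own; each of these scans is cheap.
MATCH (c:Contact)
WITH count(c) AS contacts
MATCH (a:Address)
WITH contacts, count(a) AS addresses
MATCH (ca:ContactAddress)
WITH contacts, addresses, count(ca) AS links
// cartesianRows is the number of rows the three-way disconnected MATCH would produce.
RETURN contacts, addresses, links, contacts * addresses * links AS cartesianRows
```

If cartesianRows comes back in the millions or more, that would be consistent with the climbing memory and pegged CPU described in the question.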
Try this:
MATCH (ca:ContactAddress), (c:Contact {ContactId:ca.ContactId}), (a:Address {AddressId:ca.AddressId})
MERGE (c)-[r:CONTACT_ADDRESS {ContactId:ca.ContactId,AddressId:ca.AddressId}]->(a)
That will match every :ContactAddress node, and only the :Contact and :Address nodes that match it. Then it'll create the relationship (if it didn't already exist).
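One way to sanity-check the result afterwards, sketched here under the assumption that each :ContactAddress node is meant to become exactly one relationship, is to compare two counts. The aliases joinRows and relationships are illustrative names only. Run these as two separate statements.

```cypher
// Number of join nodes that were supposed to become relationships.
MATCH (ca:ContactAddress)
RETURN count(ca) AS joinRows
```

```cypher
// Number of CONTACT_ADDRESS relationships actually present after the MERGE.
MATCH (:Contact)-[r:CONTACT_ADDRESS]->(:Address)
RETURN count(r) AS relationships
```

The two numbers can legitimately differ, for example if some :ContactAddress nodes are duplicates or refer to a ContactId or AddressId with no matching node, so treat a mismatch as a prompt to look closer rather than as proof of an error.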
If you want to be clearer, you could also split the MATCH, i.e.:
MATCH (ca:ContactAddress)
MATCH (c:Contact {ContactId:ca.ContactId}), (a:Address {AddressId:ca.AddressId})
MERGE (c)-[r:CONTACT_ADDRESS {ContactId:ca.ContactId,AddressId:ca.AddressId}]->(a)