I have the following data, which represents the distances between two objects.
data = [[('123','234'), 10],
[('134','432'), 12],
]
I would like to insert this into neo4j via py2neo v3:
for e, p in enumerate(data):
    #
    id_left = p[0][0]
    id_right = p[0][1]
    distance = p[1]
    #
    left = Node("_id", id_left)
    right = Node("_id", id_right)
    G.merge(left)
    G.merge(right)
    r = Relationship(left,'TO', right, distance=distance)
    G.create(r)
    #
But I find this to be very, very slow. What's the best way of speeding this up? I've looked around but haven't found any code example that clearly illustrates how to go about it.
It looks like you are using py2neo incorrectly to create nodes. The positional arguments you give to the Node object are labels, and properties must be passed as keyword arguments. Your current code therefore creates nodes that carry two labels (_id and the id value) and no properties.
This is slow because MERGE has nothing to match on.
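To make the label/property distinction concrete, here is a minimal sketch. The describe_node function is a hypothetical stand-in, not py2neo's Node class; it only mimics the Node(*labels, **properties) calling convention, which is assumed here for py2neo v3, so you can see how each call's arguments are split.

```python
# Stand-in for illustration only; this is NOT py2neo's Node class.
# It assumes the py2neo v3 calling convention Node(*labels, **properties).
def describe_node(*labels, **properties):
    """Return how the arguments would be split into labels and properties."""
    return {"labels": sorted(labels), "properties": dict(properties)}


# The call from the question: both positional values end up as labels.
wrong = describe_node("_id", "123")

# The corrected call: one label, and the id passed as a keyword property.
right = describe_node("MyNode", id="123")

print(wrong)
print(right)
```

With the first form there is no property for a merge to match on; with the second, the id property identifies the node.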
This is a corrected version of your code that uses the label MyNode and the property id:
from py2neo import Graph, Node, Relationship
graph = Graph(password="password")
data = [
[('123','234'), 10],
[('134','432'), 12],
]
for e, p in enumerate(data):
    #
    id_left = p[0][0]
    id_right = p[0][1]
    distance = p[1]
    #
    left = Node("MyNode", id=id_left)
    right = Node("MyNode", id=id_right)
    graph.merge(left)
    graph.merge(right)
    r = Relationship(left,'TO', right, distance=distance)
    graph.create(r)
This produces a graph of MyNode nodes, each with an id property, connected by TO relationships that carry the distance.
For better performance once you have thousands of MyNode nodes, you can add a unique constraint on the id property:
CREATE CONSTRAINT ON (m:MyNode) ASSERT m.id IS UNIQUE;
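If you prefer to create the constraint from Python rather than from the Neo4j browser, a minimal sketch could look like the following. It assumes graph is a py2neo Graph (anything exposing a run(query) method works), and the helper name ensure_unique_id is made up for this example.

```python
# The constraint statement is the one shown above, minus the trailing semicolon.
CONSTRAINT_QUERY = "CREATE CONSTRAINT ON (m:MyNode) ASSERT m.id IS UNIQUE"


def ensure_unique_id(graph):
    """Send the uniqueness constraint to the database before the bulk load.

    `graph` is assumed to be a py2neo Graph; only its run() method is used.
    """
    graph.run(CONSTRAINT_QUERY)
```

You would call ensure_unique_id(graph) once, before inserting the data.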
This code still makes 3 calls to Neo4j for every pair (two merges and one create). The most performant option is to use Cypher directly and send all the data in a single query:
data = [
[('123','234'), 10],
[('134','432'), 12],
]
params = []
for x in data:
    params.append({"left": x[0][0], "right": x[0][1], "distance": x[1] })
q = """
UNWIND {datas} AS data
MERGE (m:MyNode {id: data.left })
MERGE (m2:MyNode {id: data.right })
MERGE (m)-[r:TO]->(m2)
SET r.distance = data.distance
"""
graph.run(q, { "datas": params })
This results in the same graph as above.
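If the data grows to hundreds of thousands of pairs, sending everything in one query may use a lot of memory, so one option is to split the parameters into batches and run the same UNWIND query once per batch. The sketch below is only an illustration under a few assumptions: graph is a py2neo Graph (only its run method is used), the helper names to_params, batches and load_distances are made up for this example, and the default batch size of 1000 is an arbitrary starting point, not a measured optimum. The {datas} parameter syntax is the one used above; newer Neo4j versions expect $datas instead.

```python
from itertools import islice

# Same Cypher statement as above, kept with the {datas} parameter syntax.
QUERY = """
UNWIND {datas} AS data
MERGE (m:MyNode {id: data.left})
MERGE (m2:MyNode {id: data.right})
MERGE (m)-[r:TO]->(m2)
SET r.distance = data.distance
"""


def to_params(data):
    """Turn [[(left, right), distance], ...] into a list of parameter maps."""
    return [{"left": left, "right": right, "distance": distance}
            for (left, right), distance in data]


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def load_distances(graph, data, batch_size=1000):
    """Run one UNWIND query per batch and return the number of queries sent.

    `graph` is assumed to be a py2neo Graph; only graph.run(query, params)
    is called.
    """
    sent = 0
    for chunk in batches(to_params(data), batch_size):
        graph.run(QUERY, {"datas": chunk})
        sent += 1
    return sent
```

Calling load_distances(graph, data) with the two pairs above sends a single query; larger inputs are split into as many queries as there are batches.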