I've been using neo4j with py2neo for a couple of weeks now, and up to now it was fine to just do single-node transactions, so I would have different node types:
    class NodeA(GraphObject):
        ...

    class NodeB(GraphObject):
        ...

    # create some nodes from data and simply save them one by one
    for data in dataset:
        node_a = NodeA(data)
        node_b = NodeB(data)
        if x:
            node_a.related_to_b.add(node_b)
        g.merge(node_b)
        g.merge(node_a)
Nothing fancy. However, I'm starting to get more nodes and connections, and single transactions don't really work anymore, as expected. I've been looking for ways to do bulk inserts, but can't find any good resources. The best I've managed to accomplish is using unwind_merge_nodes_query, which has two issues:
I've been writing functions to handle the above-mentioned points, but I feel like I'm missing something and that there's a simpler way to handle batches of data.
The unwind_merge_nodes_query function isn't generally intended to be used directly, although you can do so. Usually, you'd want to use the functions from the py2neo.bulk module instead, which wrap these functions.
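As a rough sketch of what that could look like, assuming a py2neo 2021.x release where py2neo.bulk.merge_nodes and Graph.auto() are available: the helper below splits the records into fixed-size batches and merges each batch in its own auto-commit transaction. The names batched and merge_in_batches, the batch size, and the NodeA label and name key in the usage comment are made up for illustration.

```python
from itertools import islice


def batched(records, size):
    """Yield successive lists of at most `size` records."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def merge_in_batches(graph, records, merge_key, batch_size=1000, merge_fn=None):
    """Merge `records` (a sequence of dicts) batch by batch; return the count."""
    if merge_fn is None:
        # Imported lazily so the batching helper works without py2neo installed.
        # Assumption: py2neo 2021.x, which ships the py2neo.bulk module.
        from py2neo.bulk import merge_nodes as merge_fn
    total = 0
    for chunk in batched(records, batch_size):
        # graph.auto() opens an auto-commit transaction for this one batch.
        merge_fn(graph.auto(), chunk, merge_key)
        total += len(chunk)
    return total


# Hypothetical usage against a running server:
# from py2neo import Graph
# g = Graph()
# merge_in_batches(g, [{"name": "a"}, {"name": "b"}], ("NodeA", "name"))
```

The merge key is a tuple of a label followed by one or more property names, so ("NodeA", "name") merges on the name property of NodeA nodes. Relationships would need a separate pass, for example with py2neo.bulk.merge_relationships.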
Either way though, that nuance is unlikely to help much with your specific problems. As a client-side library, py2neo can only carry out operations exposed by the Neo4j server and, unfortunately, there exists no good (low level) way to import non-trivial bulk data from the client. Py2neo can't fix that.
If performance is your goal, your best bet might be to instead use a LOAD CSV Cypher statement. Note though that to do this, your input data file will need to be on or visible to the server directly.
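A minimal sketch of the LOAD CSV route, under a few assumptions: the file is written somewhere the server can read (typically Neo4j's import directory, addressed with a file:/// URL), and the label and key property are trusted values, since Cypher cannot take them as parameters. The function names, the file name node_a.csv, and the NodeA label and name key are placeholders for illustration.

```python
import csv


def write_rows_to_csv(rows, path, fieldnames):
    """Write a list of dicts to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def build_load_csv_query(file_url, label, key):
    """Build a LOAD CSV statement that merges one node per CSV row.

    `label` and `key` are interpolated into the query text, so they must
    come from trusted code, never from user input.
    """
    return (
        f"LOAD CSV WITH HEADERS FROM '{file_url}' AS row "
        f"MERGE (n:{label} {{{key}: row.{key}}}) "
        "SET n += row"
    )


# Hypothetical usage, assuming node_a.csv sits in the server's import directory:
# from py2neo import Graph
# g = Graph()
# g.run(build_load_csv_query("file:///node_a.csv", "NodeA", "name"))
```

Keep in mind that LOAD CSV reads every field as a string, so numeric properties need an explicit conversion in the Cypher statement, for example toInteger(row.age).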