ruby, neo4j, neography

Neo4j Batch Inserts


I used the Neo4j API to batch insert a few million records, which would normally take much longer if inserted individually. The import finished considerably faster, but I do not see the millions of records I inserted. Does Neo4j keep some form of queue for batch insertions and apply them over time?

If so, how can I see the progress of this queue? I am doing a count and I notice the records are increasing, but at a very slow pace.

I am using the Neography gem's batch insertion (https://github.com/maxdemarzi/neography/wiki/Batch) and the code that does the batch insert is below:

# Each user becomes one create_unique_node command; up to 500 commands go out per batch call.
User.find_in_batches(batch_size: 500) do |group|
  $neo4j.batch(*group.map { |user| ["create_unique_node", "users", "id", user.id, user.graph_node_properties] })
end

Running Neo4j 2.1.2 enterprise edition on Ubuntu 12.04 LTS.


Solution

  • Did you shut down the batch inserter correctly after finishing your insert?

    Which Neo4j version and OS are you using?

    Also make sure your memory config is correct. Configure the memory mapping for the node store and relationship store correctly; see Rik's blog post for that.

    It is not a queue; those records are written directly. How do you "check" and count? You must not access the database concurrently.

    Can you share your code?
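
Since the records are written directly rather than queued, one way to see how far an import has got is to count what each batch call returns as it completes. The following is only a minimal sketch of that idea, not the asker's actual setup. `FakeNeo`, `batch_command_for`, and `insert_in_batches` are hypothetical names introduced here, and the stub assumes that `batch` returns one result per command. Check that assumption against the real Neography client before relying on it.

```ruby
# Hypothetical stand-in for the Neography client so the sketch runs without a server.
# Assumption: batch(*commands) returns one result entry per command sent.
class FakeNeo
  attr_reader :calls

  def initialize
    @calls = []
  end

  def batch(*commands)
    @calls << commands
    commands.each_index.map { |i| { "id" => i } }
  end
end

# Minimal user record exposing the two methods used in the question's code.
User = Struct.new(:id, :graph_node_properties)

# Builds the same command array the question passes to batch.
def batch_command_for(user)
  ["create_unique_node", "users", "id", user.id, user.graph_node_properties]
end

# Sends users in slices and returns how many results came back in total.
def insert_in_batches(neo, users, batch_size: 500)
  acknowledged = 0
  users.each_slice(batch_size) do |group|
    results = neo.batch(*group.map { |user| batch_command_for(user) })
    acknowledged += results.size
  end
  acknowledged
end

users = (1..1200).map { |i| User.new(i, { "name" => "user#{i}" }) }
neo = FakeNeo.new
total = insert_in_batches(neo, users)
puts "acknowledged #{total} commands in #{neo.calls.size} batches"
```

With a real client in place of `FakeNeo`, logging the running total after each slice would show progress per completed batch call without issuing separate count queries while the import runs.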