Some Common Questions about Neo4j 3.0

1. Does the Enterprise edition support distributed graph algorithms? Or can Neo4j graph data and graph computation be distributed over cloud infrastructure? If so, how does it work?

2. If I have a server (16-core CPU, 256 GB memory, 2 TB HDD) and each node or relationship holds 1 KB of data, how many nodes and relationships can the server hold? The ratio of nodes to relationships is 1:5. If we want to import more data, what should we do?

3. For fast importing we used BatchInserter, but a single Lucene index has a size limit of 2^32 entries, so we can import fewer than 2^32 nodes. How can we get around this limit other than by using more indexes?

4. After two days of importing, the import speed has become unacceptably slow (200-600 nodes per second), only about 1% of the initial speed. I can see that the memory is full. What should we do to improve the speed? So far it has imported about 0.2B nodes and 0.5B relationships, which is half of my data. My server has 32 GB of memory.

Thanks a lot.

Solution

You have many questions here that would be better asked individually, or on the Neo4j Slack channel.

I started to write this as a comment, but ran out of chars so I'll try to point you to some resources:

1) Neo4j distributed graph model

I'm not sure exactly what you're asking here. See this document for general information on Neo4j scalability. If you're asking if graph traversals can be distributed across machines in a Neo4j cluster, the answer is no.

2) Hardware sizing

This depends a bit on your access patterns. See the hardware sizing calculator that can take workload / access patterns into account.
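As a rough upper bound only, the figures from the question can be turned into a back-of-envelope disk estimate. This sketch assumes disk space is the only constraint and that each node or relationship costs exactly 1 KB; it ignores Neo4j's store-format overhead, indexes, and transaction logs, so the real capacity will be lower. The function and constant names are made up for illustration.

```python
# Back-of-envelope disk capacity estimate. All inputs come from the question;
# treating 1 KB per record as the full on-disk cost is a simplifying assumption.
DISK_BYTES = 2 * 10**12      # 2 TB HDD (decimal terabytes)
BYTES_PER_RECORD = 1024      # about 1 KB of data per node or relationship
RELS_PER_NODE = 5            # node:relationship ratio of 1:5


def estimate_capacity(disk_bytes, bytes_per_record, rels_per_node):
    """Return (nodes, relationships) that fit if raw data size were the only cost."""
    total_records = disk_bytes // bytes_per_record
    # Each node brings rels_per_node relationships, so one "unit" is 1 + rels_per_node records.
    nodes = total_records // (1 + rels_per_node)
    relationships = nodes * rels_per_node
    return nodes, relationships


nodes, rels = estimate_capacity(DISK_BYTES, BYTES_PER_RECORD, RELS_PER_NODE)
print(f"about {nodes / 1e6:.0f}M nodes and {rels / 1e9:.2f}B relationships")
```

Under these assumptions the disk, not the core count, sets the ceiling. How much of the graph fits in memory is what the sizing calculator's access-pattern inputs are meant to address.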

3-4) Import

Can you create a new question and share your code for this? You should be able to achieve much better performance than this.
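To put the reported rate in perspective, here is a small estimate of how long the rest of the import would take if it stayed at 200-600 nodes per second. It assumes the remaining half of the data is roughly another 0.2B nodes and counts nodes only, ignoring relationships; the helper name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def remaining_days(remaining_nodes, nodes_per_second):
    """Days needed to import remaining_nodes at a constant rate."""
    return remaining_nodes / nodes_per_second / SECONDS_PER_DAY


# Assumption: the second half of the data is about as large as the first (0.2B nodes).
REMAINING_NODES = 200_000_000

for rate in (200, 600):
    days = remaining_days(REMAINING_NODES, rate)
    print(f"{rate} nodes/s: about {days:.1f} days remaining")
```

At either end of the range the remainder would take several days at best, which is why the import code itself is worth reviewing.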