java memory-management neo4j graph-databases

Neo4j 3.x using more memory than Neo4j 2.x? How to avoid that?


I updated my graph database from Neo4j 2.0.4 to Neo4j 3.3.3, and when I run it with my app, the Java process on my Mac now uses twice as much memory as before. (I run Java 1.8.)

When I launch Neo4j 2, it uses about 250 MB for the same tasks and queries, but Neo4j 3 uses about 500 MB.

I thought the updates were supposed to be more efficient?

What would be a possible way to decrease the memory use?


Solution

  • Neo4j runs on the JVM, so its memory consumption depends on the size of the heap.

    If you have not configured the heap size, Neo4j uses a heuristic to calculate it, and this heuristic has changed over time.

    In Neo4j 3.x, there are three memory areas:

    JVM heap

    The heap is used to store all transaction data and the locks on nodes and relationships, and also to cache query execution plans.

    You can configure its size in conf/neo4j.conf:

    # Java Heap Size: by default the Java heap size is dynamically
    # calculated based on available system resources.
    # Uncomment these lines to set specific initial and maximum
    # heap size.
    dbms.memory.heap.initial_size=512m
    dbms.memory.heap.max_size=512m
    

    Page cache

    A database does a lot of disk I/O. To be fast, it needs to keep your data in RAM. Neo4j 3.x does this with the page cache, which keeps parts of its binary data files in RAM, so your most frequently used data stays in memory.

    This can be configured in conf/neo4j.conf:

    # The default page cache memory assumes the machine is dedicated to running
    # Neo4j, and is heuristically set to 50% of RAM minus the max Java heap size.
    dbms.memory.pagecache.size=10g
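
    To see which of these settings are actually set on a given installation, and which are left to the heuristics, a small script can scan the configuration file. This is only a sketch: the function name read_memory_settings and the sample text are made up for illustration, and it assumes the plain key=value layout shown in the snippets above.

```python
# The three memory settings named in the configuration snippets above.
MEMORY_KEYS = (
    "dbms.memory.heap.initial_size",
    "dbms.memory.heap.max_size",
    "dbms.memory.pagecache.size",
)


def read_memory_settings(conf_text):
    """Return {setting: value or None} for the memory settings in neo4j.conf text.

    Hypothetical helper: a value of None means the setting is absent or
    commented out, so the default heuristic applies.
    """
    found = dict.fromkeys(MEMORY_KEYS)
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and commented-out settings.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() in found:
            found[key.strip()] = value.strip()
    return found


# Illustrative sample, not a real configuration file.
sample = """
# dbms.memory.heap.initial_size=512m
dbms.memory.heap.max_size=512m
dbms.memory.pagecache.size=10g
"""

for key, value in read_memory_settings(sample).items():
    print(key, "->", value if value is not None else "not set (heuristic applies)")
```

    In practice you would pass the contents of conf/neo4j.conf instead of the sample string.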
    

    Indexes

    Indexes are used to find nodes quickly for your queries, but if those indexes are not in RAM, they will be slow.

    There is no way to configure this memory size in Neo4j, but it is easy to work out with simple arithmetic (server RAM = heap + page cache + indexes + some free memory for the OS).

    You can measure the size of your indexes with this command:

    $> du -sh data/databases/graph.db/indexes
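
    The arithmetic above can be sketched in a few lines of Python. Both function names and all the numbers are hypothetical (a 4 GiB machine, a 512 MiB heap, 200 MiB of indexes, 1 GiB left for the OS). The directory walk sums file sizes, which is close to but not identical to what du reports.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all files under path (approximately what `du -s` measures)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def page_cache_budget(total_ram, heap, indexes, os_reserve):
    """Solve server RAM = heap + page cache + indexes + OS reserve for the page cache."""
    budget = total_ram - heap - indexes - os_reserve
    if budget <= 0:
        raise ValueError("heap, indexes and OS reserve already use all of the RAM")
    return budget


GIB = 1024 ** 3
MIB = 1024 ** 2

# Hypothetical figures, chosen only to illustrate the calculation.
budget = page_cache_budget(
    total_ram=4 * GIB,
    heap=512 * MIB,
    indexes=200 * MIB,  # or dir_size_bytes("data/databases/graph.db/indexes")
    os_reserve=1 * GIB,
)
print(f"dbms.memory.pagecache.size={budget // MIB}m")
# prints: dbms.memory.pagecache.size=2360m
```

    The printed line can then be copied into conf/neo4j.conf next to the heap settings.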
    

    Conclusion

    Since Neo4j 2, there have been a lot of improvements (features, performance), so yes, it is more efficient.

    Whether it is more efficient in terms of its RAM footprint, I don't know; I have never run such a test, but RAM is cheap nowadays.

    The only thing I can tell you is that you should need a smaller heap, because the data cache is now off-heap. In 2.x, Neo4j kept a cache of the graph on the heap for fast access.

    But Neo4j is highly configurable, so if you have a small server and want to fine-tune memory consumption, you should set the heap and page cache sizes explicitly.