neo4j, cypher, dump, data-transfer

Loading Neo4j database dump (neo4j-shell)


My database was affected by the bug in Neo4j 2.1.1 that tends to corrupt the database in areas where many nodes have been deleted. It turns out that most of the affected relationships were already marked for deletion in my database. I have dumped the rest of the data using neo4j-shell with a single query. This gives a 1.5 GB Cypher file that I need to import into a fresh database to get my data back in a healthy data structure.

I have noticed that the dump file contains definitions for (1) schema, (2) nodes and (3) relationships. I have already removed the schema definitions from the file because they can be applied later on. The issue now is that the dump file uses a single series of node identifiers (in the format _nodeid) for both node creation and relationship creation, so it seems that all CREATE statements (33,160,527 in my case) need to be run in a single transaction.
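For illustration only, a dump of this kind has roughly the following shape. The labels, properties and values here are invented, and the exact syntax may differ between neo4j-shell versions. The point is that the relationship lines refer back to identifiers such as _0 and _1 that were bound by the earlier node lines, so the file cannot simply be cut into independently executed pieces.

```cypher
// Hypothetical excerpt; not taken from the actual dump file.
begin
create (_0:`Person` {`name`:"Alice"})
create (_1:`Person` {`name`:"Bob"})
// This line only works while _0 and _1 are still in scope.
create (_0)-[:`KNOWS`]->(_1)
;
commit
```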

My first attempt to do so kept the server busy for 36 hours without results. I had neo4j-shell load the data directly into a new database directory instead of connecting to a server. The data files in the new database directory never showed any sign of receiving data, and the message log showed many messages indicating thread blocks.

What is the best way of getting this data back into the database? Should I load a specific config file? Do I need to allocate a large Java heap? What is the trick to loading such a large dump file into a database?


Solution

  • The dump command is not meant for larger-scale exports. There was originally a version that was, but it was not included in the product.

    If you still have the old database around, you can try a few things:

    • contact Neo4j support to help you recover your data
    • use my store-utils to copy it over to a new db (it will skip all broken records)
    • query the data with Cypher and export the results as CSV
      • you could use the shell-import-tools for that
      • and then import your data from the CSV using either the shell tools again, the LOAD CSV command, or the batch-importer
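
As a minimal sketch of the LOAD CSV route mentioned above: it assumes the nodes and relationships have been exported to two CSV files with header rows. The file paths, the `Person` label, the `KNOWS` relationship type, and the column names (`id`, `name`, `start_id`, `end_id`) are all hypothetical and would need to match your own export. `USING PERIODIC COMMIT` commits in batches, so the import does not have to fit into one huge transaction.

```cypher
// Index the lookup key first so the MATCH clauses in the
// relationship import do not scan every node.
CREATE INDEX ON :Person(id);

// Import nodes, committing every 1000 rows.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///tmp/nodes.csv" AS row
CREATE (:Person {id: row.id, name: row.name});

// Import relationships by looking up both end nodes by their exported id.
USING PERIODIC COMMIT 1000
LOAD CSV WITH HEADERS FROM "file:///tmp/rels.csv" AS row
MATCH (a:Person {id: row.start_id})
MATCH (b:Person {id: row.end_id})
CREATE (a)-[:KNOWS]->(b);
```

Note that LOAD CSV reads every field as a string, so numeric properties need an explicit conversion if the original types matter.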