java neo4j neo4j-embedded

Neo4j kernel crashing when loading large graph


I'm loading a large number of nodes and relationships into an embedded Neo4j database. After about 10,000 inserts, it dies. If I stay below that point, everything works: queries and inserts both behave as expected. It looks as if a database file is being deleted in the middle of the inserts, which causes everything to fall apart. My database is built from scratch, so if I delete my graphdb folder and restart, the failure reproduces identically every time. How do you handle large embedded Neo4j databases?

Here are the pertinent errors.

From the Java output side

First, transactions start failing to commit:

WorkerThread exception::org.neo4j.graphdb.TransactionFailureException::Unable to commit transaction
org.neo4j.graphdb.TransactionFailureException: Unable to commit transaction
        at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:140)
        ...
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.neo4j.graphdb.TransactionFailureException: commit threw exception
        at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:500)
        at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:385)
        at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:123)
        at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:124)
        ... 4 more
Caused by: javax.transaction.xa.XAException

Then it informs me that there's a missing file in the database:

        at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:560)
        at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:448)
        ... 7 more
Caused by: org.neo4j.kernel.impl.nioneo.store.UnderlyingStorageException: java.io.FileNotFoundException: /home/user/graphdb/schema/label/lucene/_1z6.frq (Protocol error)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.updateLabelScanStore(NeoStoreTransaction.java:814)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.applyCommit(NeoStoreTransaction.java:699)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.doCommit(NeoStoreTransaction.java:631)
        at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.commit(XaTransaction.java:327)
        at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commitWriteTx(XaResourceManager.java:632)
        at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:533)
        at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64)
        at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:548)
        ... 8 more
Caused by: java.io.FileNotFoundException: /home/user/graphdb/schema/label/lucene/_1z6.frq (Protocol error)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
        at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:441)
        at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:306)
        at org.apache.lucene.index.FormatPostingsDocsWriter.<init>(FormatPostingsDocsWriter.java:47)
        at org.apache.lucene.index.FormatPostingsTermsWriter.<init>(FormatPostingsTermsWriter.java:33)
        at org.apache.lucene.index.FormatPostingsFieldsWriter.<init>(FormatPostingsFieldsWriter.java:51)
        at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
        at org.apache.lucene.index.TermsHash.flush(TermsHash.java:113)
        at org.apache.lucene.index.DocInverter.flush(DocInverter.java:70)
        at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:60)
        at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:581)
        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3587)
        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3552)
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:450)
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:399)
        at org.apache.lucene.index.DirectoryReader.doOpenFromWriter(DirectoryReader.java:413)
        at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:432)
        at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:375)
        at org.apache.lucene.index.IndexReader.openIfChanged(IndexReader.java:508)
        at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:109)
        at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:57)
        at org.apache.lucene.search.ReferenceManager.maybeRefresh(ReferenceManager.java:137)
        at org.neo4j.kernel.api.impl.index.LuceneLabelScanStore.refreshSearcher(LuceneLabelScanStore.java:159)
        at org.neo4j.kernel.api.impl.index.LuceneLabelScanWriter.close(LuceneLabelScanWriter.java:82)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.updateLabelScanStore(NeoStoreTransaction.java:811)
        ... 15 more

Then I can no longer get a new transaction:

WorkerThread exception::org.neo4j.graphdb.TransactionFailureException::Unable to get transaction.
org.neo4j.graphdb.TransactionFailureException: Unable to get transaction.
        at org.neo4j.kernel.InternalAbstractGraphDatabase.transactionRunning(InternalAbstractGraphDatabase.java:1064)
        at org.neo4j.kernel.InternalAbstractGraphDatabase.beginTx(InternalAbstractGraphDatabase.java:1037)
        at org.neo4j.kernel.TransactionBuilderImpl.begin(TransactionBuilderImpl.java:43)
        at org.neo4j.kernel.InternalAbstractGraphDatabase.beginTx(InternalAbstractGraphDatabase.java:1024)
        ...
        at java.lang.Thread.run(Thread.java:745)
Caused by: javax.transaction.SystemException: Kernel has encountered some problem, please perform neccesary action (tx recovery/restart)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.neo4j.kernel.impl.transaction.KernelHealth.assertHealthy(KernelHealth.java:61)
        at org.neo4j.kernel.impl.transaction.TxManager.assertTmOk(TxManager.java:339)
        at org.neo4j.kernel.impl.transaction.TxManager.getTransaction(TxManager.java:725)
        at org.neo4j.kernel.InternalAbstractGraphDatabase.transactionRunning(InternalAbstractGraphDatabase.java:1060)
        ... 7 more
Caused by: javax.transaction.xa.XAException
        at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:560)
        at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:448)
        at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:385)
        at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:123)
        at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:124)
        ... 4 more
Caused by: org.neo4j.kernel.impl.nioneo.store.UnderlyingStorageException: java.io.FileNotFoundException: /home/user/graphdb/schema/label/lucene/_1z6.frq (Protocol error)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.updateLabelScanStore(NeoStoreTransaction.java:814)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.applyCommit(NeoStoreTransaction.java:699)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.doCommit(NeoStoreTransaction.java:631)
        at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.commit(XaTransaction.java:327)
        at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commitWriteTx(XaResourceManager.java:632)
        at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:533)
        at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64)
        at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:548)
        ... 8 more
Caused by: java.io.FileNotFoundException: /home/user/graphdb/schema/label/lucene/_1z6.frq (Protocol error)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
        at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:441)
        at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:306)
        at org.apache.lucene.index.FormatPostingsDocsWriter.<init>(FormatPostingsDocsWriter.java:47)
        at org.apache.lucene.index.FormatPostingsTermsWriter.<init>(FormatPostingsTermsWriter.java:33)
        at org.apache.lucene.index.FormatPostingsFieldsWriter.<init>(FormatPostingsFieldsWriter.java:51)
        at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
        at org.apache.lucene.index.TermsHash.flush(TermsHash.java:113)
        at org.apache.lucene.index.DocInverter.flush(DocInverter.java:70)
        at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:60)
        at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:581)
        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3587)
        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3552)
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:450)
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:399)
        at org.apache.lucene.index.DirectoryReader.doOpenFromWriter(DirectoryReader.java:413)
        at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:432)
        at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:375)
        at org.apache.lucene.index.IndexReader.openIfChanged(IndexReader.java:508)
        at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:109)
        at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:57)
        at org.apache.lucene.search.ReferenceManager.maybeRefresh(ReferenceManager.java:137)
        at org.neo4j.kernel.api.impl.index.LuceneLabelScanStore.refreshSearcher(LuceneLabelScanStore.java:159)
        at org.neo4j.kernel.api.impl.index.LuceneLabelScanWriter.close(LuceneLabelScanWriter.java:82)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.updateLabelScanStore(NeoStoreTransaction.java:811)
        ... 15 more

From the messages.log side

It starts with memory-mapping errors. The first one appears as soon as the database comes online, and more trickle in before it dies completely:

2015-01-27 21:51:29.112+0000 ERROR [org.neo4j]: [/home/user/graphdb/neostore.nodestore.db] Unable to memory map Unable to map pos=0 recordSize=15 totalSize=1048575
org.neo4j.kernel.impl.nioneo.store.MappedMemException: Unable to map pos=0 recordSize=15 totalSize=1048575
        at org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow.<init>(MappedPersistenceWindow.java:59)
        at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.allocateNewWindow(PersistenceWindowPool.java:656)
        at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.expandBricks(PersistenceWindowPool.java:617)
        at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:144)
        at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:546)
        at org.neo4j.kernel.impl.nioneo.store.NodeStore.forceGetRecord(NodeStore.java:149)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreIndexStoreView$NodeStoreScan.run(NeoStoreIndexStoreView.java:327)
        at org.neo4j.kernel.impl.api.index.IndexPopulationJob.indexAllNodes(IndexPopulationJob.java:212)
        at org.neo4j.kernel.impl.api.index.IndexPopulationJob.run(IndexPopulationJob.java:107)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Invalid argument
        at sun.nio.ch.FileChannelImpl.map0(Native Method)
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:875)
        at org.neo4j.kernel.impl.nioneo.store.StoreFileChannel.map(StoreFileChannel.java:57)
        at org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow.<init>(MappedPersistenceWindow.java:53)
        ... 13 more

Then I get this error in messages.log, which I've determined is the point where everything falls apart:

2015-01-27 21:59:41.516+0000 ERROR [org.neo4j]: setting TM not OK. Kernel has encountered some problem, please perform neccesary action (tx recovery/restart) null
javax.transaction.xa.XAException
        at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:560)
        at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:448)
        at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:385)
        at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:123)
        at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:124)
        ...
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.neo4j.kernel.impl.nioneo.store.UnderlyingStorageException: java.io.FileNotFoundException: /home/user/graphdb/schema/label/lucene/_1z6.frq (Protocol error)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.updateLabelScanStore(NeoStoreTransaction.java:814)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.applyCommit(NeoStoreTransaction.java:699)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.doCommit(NeoStoreTransaction.java:631)
        at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.commit(XaTransaction.java:327)
        at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commitWriteTx(XaResourceManager.java:632)
        at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:533)
        at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64)
        at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:548)
        ... 8 more
Caused by: java.io.FileNotFoundException: /home/user/graphdb/schema/label/lucene/_1z6.frq (Protocol error)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
        at org.apache.lucene.store.FSDirectory$FSIndexOutput.<init>(FSDirectory.java:441)
        at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:306)
        at org.apache.lucene.index.FormatPostingsDocsWriter.<init>(FormatPostingsDocsWriter.java:47)
        at org.apache.lucene.index.FormatPostingsTermsWriter.<init>(FormatPostingsTermsWriter.java:33)
        at org.apache.lucene.index.FormatPostingsFieldsWriter.<init>(FormatPostingsFieldsWriter.java:51)
        at org.apache.lucene.index.FreqProxTermsWriter.flush(FreqProxTermsWriter.java:85)
        at org.apache.lucene.index.TermsHash.flush(TermsHash.java:113)
        at org.apache.lucene.index.DocInverter.flush(DocInverter.java:70)
        at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:60)
        at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:581)
        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3587)
        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3552)
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:450)
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:399)
        at org.apache.lucene.index.DirectoryReader.doOpenFromWriter(DirectoryReader.java:413)
        at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:432)
        at org.apache.lucene.index.DirectoryReader.doOpenIfChanged(DirectoryReader.java:375)
        at org.apache.lucene.index.IndexReader.openIfChanged(IndexReader.java:508)
        at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:109)
        at org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:57)
        at org.apache.lucene.search.ReferenceManager.maybeRefresh(ReferenceManager.java:137)
        at org.neo4j.kernel.api.impl.index.LuceneLabelScanStore.refreshSearcher(LuceneLabelScanStore.java:159)
        at org.neo4j.kernel.api.impl.index.LuceneLabelScanWriter.close(LuceneLabelScanWriter.java:82)
        at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.updateLabelScanStore(NeoStoreTransaction.java:811)
        ... 15 more
2015-01-27 21:59:41.519+0000 ERROR [org.neo4j]: TM error tx commit commit threw exception

Any ideas on what's causing that .frq file to disappear?


Solution

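  • We resolved it in a side conversation: the maximum number of open files was too low (4000), which is also reported at startup.

    That causes Lucene to break internally when it can no longer open new index files, which matches the FileNotFoundException in the traces above.

    After increasing the limit, the OP could import the data successfully.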
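
    If you want to catch this before starting a large import, a minimal sketch like the one below can check the limit from inside the JVM. It assumes a HotSpot/OpenJDK runtime on a Unix-like system, where the platform MXBean implements com.sun.management.UnixOperatingSystemMXBean. The class name FdLimitCheck, the helper names, and the 40,000 threshold are illustrative assumptions, not values taken from the conversation above; use whatever minimum your Neo4j version reports at startup.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class FdLimitCheck {
    // Assumed threshold for illustration only; replace it with the minimum
    // your Neo4j version recommends in its startup warning.
    static final long ASSUMED_MIN_OPEN_FILES = 40_000L;

    // True when the limit is known (non-negative) and below the required value.
    static boolean isLimitTooLow(long maxFds, long requiredFds) {
        return maxFds >= 0 && maxFds < requiredFds;
    }

    // Maximum number of file descriptors for this process, or -1 if the
    // JVM/platform does not expose it (for example on Windows).
    static long maxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getMaxFileDescriptorCount();
        }
        return -1L;
    }

    // Number of file descriptors currently open, or -1 if unavailable.
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1L;
    }

    public static void main(String[] args) {
        long max = maxFileDescriptors();
        long open = openFileDescriptors();
        System.out.println("open file descriptors: " + open + ", limit: " + max);
        if (isLimitTooLow(max, ASSUMED_MIN_OPEN_FILES)) {
            System.err.println("Open-file limit " + max + " is below "
                    + ASSUMED_MIN_OPEN_FILES + "; raise it before importing.");
        }
    }
}
```

    On Linux, ulimit -n shows the same limit for the current shell. Running the check in the same JVM that hosts the embedded database matters, because the limit is per process and is inherited from whatever launched it.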