I'm using neo4j in a glassfish server through a modified version of Alex Smirnov neo4j JCA connector. My version is available here : https://github.com/Riduidel/neo4j-connector I'm using this connector with neo4j 1.8. As a consequence, when i want to use it, i first install the connector in my Glassfish application server, then use this connector in applications wishing to connect to.
It works OK when using it with fresh stores. But, when using it with stores created with previous version, I encounter weird bugs.
Typically, I got today the following stack
javax.resource.spi.ResourceAllocationException: Error in allocating a connection. Cause: Failed to transition org.neo4j.kernel.InternalAbstractGraphDatabase$DefaultKernelExtensionLoader@3bbd53b1 from NONE to STOPPED
...
...
.../* JCA internal exception stack */
...
...
Caused by: com.sun.appserv.connectors.internal.api.PoolingException: Failed to transition org.neo4j.kernel.InternalAbstractGraphDatabase$DefaultKernelExtensionLoader@494b584c from NONE to STOPPED
at com.sun.enterprise.resource.pool.ConnectionPool.createSingleResource(ConnectionPool.java:924)
at com.sun.enterprise.resource.pool.ConnectionPool.createResource(ConnectionPool.java:1185)
at com.sun.enterprise.resource.pool.datastructure.RWLockDataStructure.addResource(RWLockDataStructure.java:98)
... 66 more
Caused by: org.neo4j.kernel.lifecycle.LifecycleException: Failed to transition org.neo4j.kernel.InternalAbstractGraphDatabase$DefaultKernelExtensionLoader@494b584c from NONE to STOPPED
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:388)
at org.neo4j.kernel.lifecycle.LifeSupport.init(LifeSupport.java:82)
at org.neo4j.kernel.lifecycle.LifeSupport.start(LifeSupport.java:116)
at org.neo4j.kernel.InternalAbstractGraphDatabase.run(InternalAbstractGraphDatabase.java:227)
at org.neo4j.kernel.EmbeddedGraphDatabase.<init>(EmbeddedGraphDatabase.java:79)
at org.neo4j.kernel.EmbeddedGraphDatabase.<init>(EmbeddedGraphDatabase.java:70)
at com.netoprise.neo4j.AbstractNeo4jManagedConnectionFactory.createDatabase(AbstractNeo4jManagedConnectionFactory.java:165)
at com.netoprise.neo4j.AbstractNeo4jManagedConnectionFactory.createDatabase(AbstractNeo4jManagedConnectionFactory.java:127)
at com.netoprise.neo4j.Neo4jManagedConnectionFactory.createManagedConnection(Neo4jManagedConnectionFactory.java:163)
at com.sun.enterprise.resource.allocator.ConnectorAllocator.createResource(ConnectorAllocator.java:160)
at com.sun.enterprise.resource.pool.ConnectionPool.createSingleResource(ConnectionPool.java:907)
... 68 more
Caused by: java.lang.AssertionError
at org.neo4j.index.impl.lucene.LuceneDataSource.cleanWriteLocks(LuceneDataSource.java:265)
at org.neo4j.index.impl.lucene.LuceneDataSource.cleanWriteLocks(LuceneDataSource.java:260)
at org.neo4j.index.impl.lucene.LuceneDataSource.cleanWriteLocks(LuceneDataSource.java:260)
at org.neo4j.index.impl.lucene.LuceneDataSource.cleanWriteLocks(LuceneDataSource.java:260)
at org.neo4j.index.impl.lucene.LuceneDataSource.<init>(LuceneDataSource.java:185)
at org.neo4j.index.lucene.LuceneIndexProvider.load(LuceneIndexProvider.java:72)
at org.neo4j.kernel.InternalAbstractGraphDatabase$DefaultKernelExtensionLoader.loadIndexImplementations(InternalAbstractGraphDatabase.java:1171)
at org.neo4j.kernel.InternalAbstractGraphDatabase$DefaultKernelExtensionLoader.init(InternalAbstractGraphDatabase.java:1143)
at org.neo4j.kernel.lifecycle.LifeSupport$LifecycleInstance.init(LifeSupport.java:382)
... 78 more
A fast inspection reveals that this exception is linked to an undeletable "write.lock" file. My write.lock file can't be deleted because I guess migration is not over. How can I make sure the migration is done before using it without migrating it outside of Glassfish ?
Is there a way to ahve exclusive store migrations in that context ? And if so, how ? And is it the solution for my problem ?
EDIT 1 Added exception message.
EDIT 2 All this only happen when loaded graph was previously used with a Neo4j 1.5 and now with a Neo4j 1.8 connector. when graph is created by connector, absolutely no error happens.
EDIT 3 Strangely enough, this happens as long as there is no debugger plugged into that code : as soon as I try to debug it, the issue stop appearing. Which make me thinking there may be a migration cleanup mechanism that remvoe the write lock once migration is done, and this cleanup is not performed when using my neo4j JCA connector. Is it a valid observation ?
After further investigations, truth revealed to not be in multiple calls to the EmbeddedGraphDatabase
constructor, but instead to multiple identicail IndexProvider
being loaded.
I use neo4j embedded in an open-source JCA connector.
In this connector, the org.neo4j.kernel.Service
class is replaced by a custom one which contains a workaround regarding service loading for JBoss non shared libraries.
Unfortunatly, in our context, this workaround implies loading twice the index provider :
Why ?
Because, as our neo4j instance is using for application data AND for authentication, neo4j connector jar is put in ${domain}/lib
. As a consequence, due to Classloader delegation in application server, the EAR classloader delegates to the Glassfish library classloader, and find this way the LuceneIndexProvider
. Then, the Glassfish library classloader is directly used to load the same LuceneIndexProvider
class.
This concludes by us having two LuceneIndexProvider
objects, both trying to migrate the lucene index. Which lead to the AssertionError
as the write.lock
file created by the first object should be deleted by the second one, which can't do that.
I've then changed slightly that very specific class to use JBoss workaround only when default loading mechanism do not return any class (seee commit here). This small change worked like a charm, so I think you can considered this issue as fixed.