neo4j gremlin tinkerpop rexster

Fastest way to delete Neo4j graph from Rexster


My use case requires that I clear the graph as quickly as possible so that I can ingest a new one. I have millions of vertices and edges in a Neo4j graph exposed through Rexster. I tried deleting a graph with 2.8 million edges using g.E.remove() from the Rexster interface:

localhost:8182/graphs/neo4jsample/tp/gremlin?script=g.E.remove();

This takes close to 200 seconds, and that is just for deleting the edges. When my solution is live, I have to empty the graph in under 10 seconds.
I also tried deleting the folder in which the Neo4j graph (.db) files are saved (/tmp/neo4j-graph by default). Even after doing this, Rexster still returns results to Gremlin queries, just as it did before neo4j-graph was deleted. I guess this is because the graph is cached when the Rexster server starts. As a next step, I first shut down the graph:

localhost:8182/graphs/neo4jsample/tp/gremlin?script=g.shutdown();

... and then deleted the 'neo4j-graph' folder. I then tried to create a new graph folder with

localhost:8182/graphs/neo4jsample/tp/gremlin?script=g = new Neo4jGraph('/tmp/neo4j-graph')

I am sticking to this location because it is the one configured in 'rexster.xml'. A new 'neo4j-graph' folder is created, containing all the Neo4j '.db' files. However, any Gremlin query using g (for example g.V.map() or g.addVertex([name:"John Doe",age:50])) now throws errors, of which the only one I found intelligible is:

 org.neo4j.graphdb.TransactionFailureException: Database is currently not available. No blocking components

Is there any way to get 'g' working well again in such a case (without restarting Rexster)?
Is there a faster way of deleting the whole graph than the ones I have mentioned?


Solution

  • You are doing it the right way, with shutdown() and data directory deletion. Note that even Neo4j Server works this way (it requires a shutdown). Unfortunately, that means a Rexster restart is needed to regenerate a fresh Neo4j graph.
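
The stop, delete, restart sequence described above could be scripted roughly as follows. This is only a minimal sketch in Python, not part of Rexster: the function name `reset_graph` and the `stop_server` / `start_server` callables are hypothetical placeholders for whatever stops and starts Rexster in a given deployment, and the data directory is assumed to be the one configured in 'rexster.xml'.

```python
import shutil
import time
from pathlib import Path
from typing import Callable


def reset_graph(data_dir: str,
                stop_server: Callable[[], None],
                start_server: Callable[[], None]) -> float:
    """Stop the server, delete the graph's data directory, start the server.

    `stop_server` and `start_server` are hypothetical hooks supplied by the
    caller (for example, wrappers around the scripts used to run Rexster).
    Returns the elapsed wall-clock time in seconds.
    """
    started = time.monotonic()

    # The store files must not be in use while they are deleted.
    stop_server()

    # Remove the whole data directory in one go instead of deleting
    # edges and vertices one by one through Gremlin.
    path = Path(data_dir)
    if path.exists():
        shutil.rmtree(path)

    # On startup the server is expected to recreate an empty graph
    # at the configured location.
    start_server()

    return time.monotonic() - started
```

Calling it as `reset_graph('/tmp/neo4j-graph', stop, start)` returns the total time taken, which can be compared against the 10-second budget from the question. Most of that time will likely be spent in the server restart rather than in the directory deletion, so it is worth measuring in the actual environment.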