I'm using JanusGraph to add vertices to a Cassandra-backed database, and I noticed a large performance discrepancy between adding a vertex with (1) the addVertex() method provided by the JanusGraph Java libraries and (2) the addV() Gremlin traversal step. Why is there such a discrepancy?
I am using JanusGraph version 0.2.0 with cql as the storage backend. I created a test that compares the time in milliseconds it takes to add and commit a vertex to the graph with three methods: (1) the addV() Gremlin step, (2) the addV() Gremlin step followed by a next() step to get the newly created vertex, and (3) the JanusGraph addVertex() method. I am starting from a completely empty graph storage. The code I used can be found below.
final Builder builder = JanusGraphFactory.build()
        .set("storage.backend", "cql")
        .set("storage.hostname", Config.get(CommonConfig.cassandra_host));
final JanusGraph graph = builder.open();
long nowMillis = TimeUtils.nowMillis();
graph.traversal().addV("myLabel");
graph.traversal().tx().commit();
System.out.println("(1) - Add vertex traversal only took " + (TimeUtils.nowMillis() - nowMillis) + " millis");
nowMillis = TimeUtils.nowMillis();
graph.traversal().addV("myLabel").next();
graph.traversal().tx().commit();
System.out.println("(2) - Add vertex traversal and next took " + (TimeUtils.nowMillis() - nowMillis) + " millis");
nowMillis = TimeUtils.nowMillis();
graph.addVertex("myLabel");
graph.traversal().tx().commit();
System.out.println("(3) - Add vertex method took " + (TimeUtils.nowMillis() - nowMillis) + " millis");
This is a sample output of running this:
(1) - Add vertex traversal only took 15 millis
(2) - Add vertex traversal and next took 739 millis
(3) - Add vertex method took 682 millis
This suggests to me that (3), adding with JanusGraph addVertex(), does something similar to (2), but I don't understand why the time differences are so large. What causes (2) and (3) to take an order of magnitude longer to run than (1)?
The first bit of Gremlin that you are testing doesn't actually create a vertex. You are just measuring the creation of a Traversal object, not its iteration: traversals are lazy, and nothing is executed until a terminal step such as next() or iterate() is called. The other two actually create a Vertex object in the graph, which is why they take longer. The general recommendation is to not use Graph.addVertex(), as that is not a user-focused API; it is meant for graph providers like JanusGraph. Use only the Gremlin language for interacting with your graph, and that will give you the widest level of code portability.
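To illustrate why building a traversal without iterating it does nothing, here is a minimal sketch that uses a plain Java Stream as a stand-in for a Gremlin Traversal. It is only an analogy and does not touch JanusGraph: the class LazyTraversalDemo, the store list, and this addV helper are hypothetical names made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class LazyTraversalDemo {
    // Hypothetical stand-in for the graph's vertex storage; not a JanusGraph type.
    static final List<String> store = new ArrayList<>();

    // Builds a lazy pipeline. The side effect ("adding the vertex") is only
    // performed when an element is pulled through it by a terminal operation.
    static Stream<String> addV(String label) {
        return Stream.of(label).peek(store::add);
    }

    public static void main(String[] args) {
        // Like case (1): the pipeline is built but never iterated.
        addV("myLabel");
        System.out.println("after build only: " + store.size());

        // Like case (2): a terminal operation pulls the element through,
        // playing the role of next().
        addV("myLabel").findFirst();
        System.out.println("after findFirst:  " + store.size());

        // Draining the pipeline for its side effects only, playing the role
        // of iterate().
        addV("myLabel").forEach(v -> { });
        System.out.println("after forEach:    " + store.size());
    }
}
```

Applied to the benchmark in the question, this means case (1) would only become comparable to the other two if the traversal ended in a terminal step, for example addV("myLabel").iterate() when the returned vertex is not needed.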