graph, datastax-enterprise, gremlin, datastax-enterprise-graph

DSE Graph batch write with if-not-exists on edges


I am using DSE Graph to load data from an Excel file. My Java code builds addE Gremlin queries and then executes them against DSE Graph.

In my current test I need to run 400,000 addE Gremlin queries across two edge labels.

1) What is the best practice for finishing this execution in a few minutes? Right now I submit the Gremlin queries in batches of 1000 via dseSession.executeGraph(new SimpleGraphStatement("")), which fails with the exception Method code too large! at groovyjarjarasm.asm.MethodWriter

2) For the edge labels in this use case, my schema defines single cardinality. I am also using custom vertex IDs for the vertices. So if an edge already exists, should DSE just ignore it without raising an exception?


Solution

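  • Instead of concatenating the addE statements into one large script, send one fixed script and pass the edges as a single query parameter (called arg in the scripts below). A script with 1000 statements compiles to a single method, and that is what exceeds the JVM's method size limit and triggers Method code too large. The query parameter should be a simple array that looks like this: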

    [[from1, to1, label1], [from2, to2, label2], ...]
    

    Then your script should look like this:

    for (def triple in arg) {
      def (id1, id2, lbl) = triple
      def v1 = graph.vertices(id1).next()
      def v2 = graph.vertices(id2).next()
      if (!g.V(v1).outE(lbl).filter(inV().is(v2)).hasNext()) {
        v1.addEdge(lbl, v2)
      }
    }
    

    Alternatively:

    for (def triple in arg) {
      def (id1, id2, lbl) = triple
      def v1 = graph.vertices(id1).next()
      if (!g.V(v1).outE(lbl).filter(inV().hasId(id2)).hasNext()) {
        v1.addEdge(lbl, graph.vertices(id2).next())
      }
    }
    

    Try both variants; at least one of them should outperform any other solution.
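
On the Java side, the edge list from the question can be turned into that parameter and split into batches. The sketch below is only an illustration: the class name EdgeBatches, the helpers triple and partition, the placeholder IDs and labels, and the batch size of 1000 (taken from the question, not a recommendation) are all assumptions. The driver call is left as a comment because it depends on the DSE Java driver being on the classpath; it assumes SimpleGraphStatement.set(name, value) is used to bind arg.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EdgeBatches {

    // Fixed script (second variant above); only the bound parameter "arg" changes per request.
    static final String SCRIPT =
        "for (def triple in arg) {\n" +
        "  def (id1, id2, lbl) = triple\n" +
        "  def v1 = graph.vertices(id1).next()\n" +
        "  if (!g.V(v1).outE(lbl).filter(inV().hasId(id2)).hasNext()) {\n" +
        "    v1.addEdge(lbl, graph.vertices(id2).next())\n" +
        "  }\n" +
        "}";

    // Builds one [from, to, label] entry; the IDs are treated as opaque values.
    static List<Object> triple(Object fromId, Object toId, String label) {
        return Arrays.asList(fromId, toId, label);
    }

    // Splits the full list into consecutive batches of at most batchSize items.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Placeholder data standing in for the rows read from Excel.
        List<List<Object>> edges = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            edges.add(triple("from" + i, "to" + i, i % 2 == 0 ? "labelA" : "labelB"));
        }
        for (List<List<Object>> batch : partition(edges, 1000)) {
            // Assumed driver call, not executed in this sketch:
            // dseSession.executeGraph(new SimpleGraphStatement(SCRIPT).set("arg", batch));
            System.out.println("would send batch of " + batch.size() + " edges");
        }
    }
}
```

Because the script text never changes, its compiled size no longer grows with the batch, so the batch size can be tuned separately against request size and timeouts.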