cassandra datastax-enterprise datastax-java-driver datastax-enterprise-graph

Iterating a GraphTraversal with GraphFrame causes UnsupportedOperationException: "Row to Vertex conversion is not supported"


The following code

    GraphTraversal<Row, Edge> traversal = gf().E().hasLabel("foo").limit(5);
    while (traversal.hasNext()) {}

throws the following exception:

java.lang.UnsupportedOperationException: Row to Vertex conversion is not supported: Use .df().collect() instead of the iterator

    at com.datastax.bdp.graph.spark.graphframe.DseGraphTraversal.iterator$lzycompute(DseGraphTraversal.scala:92)
    at com.datastax.bdp.graph.spark.graphframe.DseGraphTraversal.iterator(DseGraphTraversal.scala:78)
    at com.datastax.bdp.graph.spark.graphframe.DseGraphTraversal.hasNext(DseGraphTraversal.scala:129)

The exception says to use .df().collect(), but gf().E().hasLabel("foo") does not let me call .df() afterwards. In other words, the object returned by hasLabel() has no df() method.

I'm using the Java API via dse-graph-frames:5.1.4 along with dse-byos_2.11:5.1.4.


Solution

  • The short answer: you need to cast the GraphTraversal to DseGraphTraversal, which has a df() method. Then use one of the Spark Dataset methods to collect the Rows:

    List<Row> rows =
       ((DseGraphTraversal)graph.E().hasLabel("foo"))
       .df().limit(5).collectAsList();
    

    DseGraphFrame does not yet support the full TinkerPop specification, so you cannot receive TinkerPop Vertex or Edge objects. (The limit() method is also not implemented in DSE 5.1.x.) It is recommended to switch to the Spark Dataset API: call df() to get a Dataset<Row>, then use Dataset-based filtering and collecting.

    If you only need Edge/Vertex properties, you can still use TinkerPop valueMap() or values():

        GraphTraversal<Row, Map<String,Object>> traversal = graph.E().hasLabel("foo").valueMap();
        while (traversal.hasNext()) {
            Map<String,Object> properties = traversal.next(); // consume each result so the loop terminates
        }
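
The cast in the short answer is needed because df() is declared on the concrete DseGraphTraversal class, while hasLabel() is typed as returning the plain GraphTraversal interface. The sketch below illustrates that pattern with stand-in types only: Traversal, FrameTraversal, CastSketch and rowsOf are hypothetical names invented for this example and are not part of DSE or TinkerPop. It assumes nothing about the real classes beyond what the question and answer state.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the generic traversal interface: it declares no df().
interface Traversal {
    boolean hasNext();
}

// Hypothetical stand-in for the concrete implementation that adds df().
final class FrameTraversal implements Traversal {
    private final List<String> rows;

    FrameTraversal(List<String> rows) {
        this.rows = rows;
    }

    @Override
    public boolean hasNext() {
        // Mirrors the behavior reported in the question: iterating is unsupported.
        throw new UnsupportedOperationException("use df() instead of the iterator");
    }

    // Only reachable through the concrete type, hence the cast below.
    List<String> df() {
        return rows;
    }
}

final class CastSketch {
    // Returns the rows of a frame-backed traversal, or fails with a clear message.
    static List<String> rowsOf(Traversal traversal) {
        if (!(traversal instanceof FrameTraversal)) {
            throw new IllegalArgumentException(
                "not a frame-backed traversal: " + traversal.getClass().getName());
        }
        return ((FrameTraversal) traversal).df();
    }

    public static void main(String[] args) {
        Traversal t = new FrameTraversal(Arrays.asList("edge-1", "edge-2"));
        // t.df() would not compile here: the static type Traversal has no df().
        System.out.println(rowsOf(t));
    }
}
```

The instanceof check is optional; it only turns a possible ClassCastException into a more descriptive error if the traversal ever comes from a different source.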