Search code examples
javagraph-databasesgremlintinkerpop3

Depth-first tree traversal in TinkerPop graph


Given a tree-structured TinkerPop graph with vertices connected by labeled parent-child relationships ([parent-PARENT_CHILD->child]), what's the idiomatic way to traverse and find all those nodes?

I'm new to graph traversals, so it seems more or less straightforward to traverse them with a recursive function:

Stream<Vertex> depthFirst(Vertex v) {
  Stream<Vertex> selfStream = Stream.of(v);
  Iterator<Vertex> childIterator = v.vertices(Direction.OUT, PARENT_CHILD);
  if (childIterator.hasNext()) {
    return selfStream.appendAll(
      Stream.ofAll(() -> childIterator)
        .flatMap(this::depthFirst)
    );
  }
  return selfStream;
}

(N.b. this example uses Vavr streams, but the Java stream version is similar, just slightly more verbose.)

I assume a graph-native implementation would be more performant, especially on databases other than the in-memory TinkerGraph.

However, when I look at the TinkerPop tree recipes, it's not obvious what combination of repeat() / until() etc. is the right one to do what I want.

If I then want to find only those vertices (leaf or branch) having a certain label, again, I can see how to do it with the function above:

Stream<Vertex> nodesWithMyLabel = depthFirst(root)
  .filter(v -> "myLabel".equals(v.label()));

but it's far from obvious that this is efficient, and I assume there must be a better graph-native approach.


Solution

  • Here's what I ended up with, after a certain amount of trial and error:

    GraphTraversal<Vertex, Vertex> traversal = 
      graph.traversal().V(parent)
        .repeat(out(PARENT_CHILD))     // follow only edges labeled PARENT_CHILD
        .emit()
        .hasLabel("myLabel");          // filter for vertices labeled "myLabel"
    

    Note that this is slightly different from the recursive version in the original question since I realized I don't actually want to include the parent in the result. (I think, from the Repeat Step docs, that I could include the parent by putting emit() before repeat(), but I haven't tried it.)