gremlin amazon-neptune tinkerpop3

What is the Gremlin query to get connected nodes and vertex labels in Amazon Neptune with TinkerPop3?


I am using Apache TinkerPop Gremlin to query a graph database. I need to get not only the out edges but also the vertices that those edges connect to. Here is a diagram of the database I am trying to query. The gray boxes above each node are the node IDs. The boxes on the edges show the label and the properties.

Node Diagram

I am able to use the following query (Java):

GraphTraversal<Vertex, Map<Object, Object>> x = g.V("694837")
  .outE()
  .valueMap()
  .with(WithOptions.tokens);

Which returns:

[
  {label=match, p=0.69, id=6ac426c7-c7e1-8385-34f6-8f92a3d53057}, 
  {label=match, p=0.4,  id=76c426c7-c87d-ec2b-255a-a94dee636729}, 
  {label=match, p=0.44, id=6ac426c7-c91e-86a7-d4f8-f6bd9c075739}, 
  {label=match, p=0.79, id=dac426c7-c742-2d09-f7f1-a40fdc43352f}, 
  {label=match, p=0.12, id=d4c426c7-c9be-a9c0-ad12-fffed0f311b0}, 
  {label=match, p=0.03, id=46c426c7-ca5e-e4ea-fd60-695746befd84}, 
  {label=match, p=0.19, id=4cc426c7-c695-c67e-712f-85cd2b359e2f}, 
  {label=has,           id=8ec426c7-c293-0b42-30e1-56bafe0659dd}, 
  {label=is,    p=0.44, id=04c426c7-c1e1-9920-c3d7-afa9c6f6789a}, 
  {label=is,    p=0.49, id=40c426c7-c240-72d1-71f1-aaf02ba19842}
]

What I need is the above but with the vertex id of the connected vertices instead of the edge id.

[
  {label=match, p=0.69, vertex_id=39085}, 
  {label=match, p=0.4,  vertex_id=482928}, 
  {label=match, p=0.44, vertex_id=5948376}, 
  {label=match, p=0.79, vertex_id=23980}, 
  {label=match, p=0.12, vertex_id=873632}, 
  {label=match, p=0.03, vertex_id=961837}, 
  {label=match, p=0.19, vertex_id=184928}, 
  {label=has,           vertex_id=TL}, 
  {label=is,    p=0.44, vertex_id=GS}, 
  {label=is,    p=0.49, vertex_id=M}
]

Solution

  • There are actually a few ways to do this. You can use path().by() to fetch the entire path and return the results in a path structure:

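    // Assumes: import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.*;
    // path() yields Path objects, so the traversal's end type is Path.
    GraphTraversal<Vertex, Path> x = g.V("694837")
      .outE().inV()
      .path()
           .by(
                valueMap()
                .with(WithOptions.tokens)
           );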
    

    Or you can use project to get a projection from each edge:

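    // Step onto each out edge first so that every edge yields its own projection;
    // a by(outE()...) modulator on the vertex would only use the first edge.
    GraphTraversal<Vertex, Map<String, Object>> x = g.V("694837")
      .outE()
      .project("edge", "vertex")
           .by(valueMap()
                .with(WithOptions.tokens))
           .by(inV().valueMap()
                .with(WithOptions.tokens));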
    

    Or you can use as-labels and select():

    GraphTraversal<Vertex, Map<String, Object>> x = g.V("694837")
        .outE().as("edge")
        .inV().as("vertex")
        .select("edge", "vertex")
            .by(valueMap().with(WithOptions.tokens));
    

    In the end, a projection from each edge may work out best here:

    // project() is applied per edge, so each out edge produces one map.
    GraphTraversal<Vertex, Map<String, Object>> x = g.V("694837")
      .outE()
      .project("label", "p", "vertex_id")
           .by(label())
           .by(values("p"))
           .by(inV().id());
    

    The optional p may be a little tricky. Do you want the key to be absent when the property is not available, or can you project an empty value? If the latter, you could replace the second by() with by(coalesce(values("p"), constant(""))).

    A good resource for learning Gremlin is Practical Gremlin: https://www.kelvinlawrence.net/book/PracticalGremlin.html
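
The two options for the optional p property raised in the answer (leave the key out, or fill it with an empty value) can be sketched in plain Java, independent of Gremlin. This is only an illustration of the resulting row shapes under stated assumptions: the class OptionalPropertySketch and the helper row(...) are hypothetical names that are not part of TinkerPop, and each row is modeled as an ordinary map with string keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: models one output row per edge as a plain map.
class OptionalPropertySketch {

    // Builds a row from an edge label, an optional "p" value (null when the
    // edge has no such property), and the id of the vertex the edge points to.
    // fillMissing = true mimics projecting an empty string for a missing "p";
    // fillMissing = false leaves the "p" key out of the row entirely.
    static Map<String, Object> row(String label, Double p, String vertexId, boolean fillMissing) {
        Map<String, Object> out = new LinkedHashMap<>(); // keeps insertion order
        out.put("label", label);
        if (p != null) {
            out.put("p", p);
        } else if (fillMissing) {
            out.put("p", "");
        }
        out.put("vertex_id", vertexId);
        return out;
    }

    public static void main(String[] args) {
        // Edge with a "p" property: identical under both options.
        System.out.println(row("match", 0.69, "39085", true));  // {label=match, p=0.69, vertex_id=39085}
        // Edge without "p", filled with an empty value.
        System.out.println(row("has", null, "TL", true));       // {label=has, p=, vertex_id=TL}
        // Edge without "p", key omitted (matches the desired output in the question).
        System.out.println(row("has", null, "TL", false));      // {label=has, vertex_id=TL}
    }
}
```

The omitted-key form matches the desired output shown in the question, where the has edge carries no p entry. The empty-value form keeps every row the same shape, which can be simpler for code that expects a fixed set of keys.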