gremlin, upsert, amazon-neptune

Gremlin - Upsert of vertex not working with coalesce


I am new to Gremlin and have a very simple case where I need to do the following:

  • If vertex exists
    • Update properties
  • Else
    • Add vertex with properties

I am using the Java API for this.

My code:

g.V().hasLabel("Entity").has("identifier", "123").fold()
.coalesce(
    __.unfold(),
    __.addV("Entity")
        .property("identifier", "123")
        .property("value", "A")
        .property("action", "add")
    )
.property("value", "A")
.property("action", "update")
.iterate();

I know that this is a very simple case, and I followed the example given in CosmosDB Graph : "upsert" query pattern.

But it doesn't work. If the vertex doesn't exist, it is added with the given properties, but then the update properties are applied to it as well.


Solution

  • When you write Gremlin you need to think in terms of streams. V() produces a stream of all vertices in the graph. Envision each item in that stream hitting the hasLabel() and has() filters and being pared away until the survivors reach the reducing step of fold(), which produces a List of the vertices that match the filter criteria or, if there are none, simply produces an empty List. Either way, that List becomes the new object in the stream.

    From there, coalesce() acts as a kind of if-then: the first child stream that returns a value is exhausted and the remaining child streams are ignored. If the List produced by fold() contains a vertex, unfold() emits it, so coalesce() produces that existing vertex, which then goes on to the final two steps of property("value", "A").property("action", "update"). If the List is empty, the unfold() stream produces no objects and coalesce() moves to the next child stream, which starts with addV(). The addV() stream produces a new Vertex with the specified properties, but coalesce(), as its parent, then emits that newly added vertex into the stream, so it too continues to those final two steps, which overwrite the property values you provided to addV().

    If you want to have two separate paths, then you might do something like this:

    g.V().hasLabel("Entity").has("identifier", "123").fold()
    .coalesce((Traversal)
        __.unfold()
            .property("value", "A")
            .property("action", "update"),
        __.addV("Entity")
            .property("identifier", "123")
            .property("value", "A")
            .property("action", "add")
        )
    .iterate();
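
To make the difference between the two traversals concrete, here is a minimal plain-Java sketch that imitates the stream behaviour described above. It does not use TinkerPop at all: the class CoalesceModel, its coalesce() helper, and the Map-based "vertices" are hypothetical stand-ins invented for illustration, and the sketch only models "first branch that yields a value wins", not the real step implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical model, not TinkerPop: a "vertex" is a property map, a "graph" is a list of them.
class CoalesceModel {

    // Stand-in for coalesce(): use the first branch that yields a value, skip the other.
    static Map<String, String> coalesce(Supplier<Optional<Map<String, String>>> first,
                                        Supplier<Optional<Map<String, String>>> second) {
        Optional<Map<String, String>> result = first.get();
        if (result.isPresent()) {
            return result.get();
        }
        return second.get().orElseThrow(() -> new IllegalStateException("no branch produced a value"));
    }

    // Stand-in for V().has("identifier", id).fold() followed by unfold().
    static Optional<Map<String, String>> find(List<Map<String, String>> graph, String id) {
        return graph.stream().filter(v -> id.equals(v.get("identifier"))).findFirst();
    }

    // Stand-in for the addV() branch: creates the vertex with action = "add".
    static Map<String, String> addVertex(List<Map<String, String>> graph, String id) {
        Map<String, String> v = new HashMap<>();
        v.put("identifier", id);
        v.put("value", "A");
        v.put("action", "add");
        graph.add(v);
        return v;
    }

    // Shape of the question's traversal: property steps placed AFTER coalesce().
    static Map<String, String> upsertTrailing(List<Map<String, String>> graph, String id) {
        Map<String, String> v = coalesce(
                () -> find(graph, id),
                () -> Optional.of(addVertex(graph, id)));
        v.put("value", "A");
        v.put("action", "update"); // runs whichever branch produced v
        return v;
    }

    // Shape of the corrected traversal: the update steps live INSIDE the first branch.
    static Map<String, String> upsertPerBranch(List<Map<String, String>> graph, String id) {
        return coalesce(
                () -> find(graph, id).map(v -> {
                    v.put("value", "A");
                    v.put("action", "update");
                    return v;
                }),
                () -> Optional.of(addVertex(graph, id)));
    }

    public static void main(String[] args) {
        List<Map<String, String>> g1 = new ArrayList<>();
        System.out.println(upsertTrailing(g1, "123").get("action"));  // prints "update"

        List<Map<String, String>> g2 = new ArrayList<>();
        System.out.println(upsertPerBranch(g2, "123").get("action")); // prints "add"
        System.out.println(upsertPerBranch(g2, "123").get("action")); // prints "update"
        System.out.println(g2.size());                                // prints 1
    }
}
```

In the first shape, a vertex that was just created is immediately relabelled "update", which matches the behaviour reported in the question. In the second shape, each branch carries its own property steps, so a new vertex keeps "add" and only a pre-existing one is marked "update".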
    

    More information about upserting vertices can be found in this StackOverflow question.

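    UPDATE: As of TinkerPop 3.6.0, the fold()/coalesce()/unfold() pattern has been largely replaced by the new mergeV() and mergeE() steps, which greatly simplify the Gremlin required for an upsert-like operation. The example below is in Groovy syntax; from Java, the same steps take Map<Object, Object> arguments, with T.label as the label key and Merge.onCreate and Merge.onMatch as the option tokens. Under 3.6.0 and newer versions, you would write: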

    g.mergeV([(label): 'Entity', identifier: '123', value: 'A']).
      option(onCreate, [action: 'add']).
      option(onMatch, [action: 'update'])

    Note that mergeV() uses every key in its first Map as match criteria, so this traversal matches on value as well as on the label and identifier.