titan gremlin tinkerpop3

Gremlin - how do you merge vertices to combine their properties without listing the properties explicitly?


Background: I'm trying to implement a time-series versioned DB with Gremlin (TinkerPop v3), following this approach.

[Image: identity nodes (blue) linked to state nodes (red) by 'state' edges]

For a given identity node (in blue), I want to get the latest state node (in red). The two are linked by a 'state' edge that carries a timestamp range. I want to return a single aggregated object containing the id (cid) from the identity node and all the properties from the state node, without listing those properties explicitly. (8640000000000000 is my way of indicating no 'to' date, i.e. the edge is current; this differs slightly from the image shown.)

I've got this far:

:> g.V().hasLabel('product').
     as('cid').
     outE('state').
     has('to', 8640000000000000).
     inV().
     as('name').
     as('price').
     select('cid', 'name','price').
     by('cid').
     by('name').
     by('price')

=>{cid=1, name="Cheese", price=2.50}
=>{cid=2, name="Ham", price=5.00}

As you can see, I have to list the properties of the 'state' node: in the example above, the name and price properties of a product. This needs to work for any domain object, so I don't want to list the properties every time. I could run a query beforehand to fetch the property names, but I don't think I should need two queries and the overhead of two round trips. I've looked at 'aggregate', 'union', 'fold' etc., but nothing seems to do this.

Any ideas?

===================

Edit: Daniel's answer doesn't quite do what I want at the moment, so I'm going to restate the problem using his example graph. In the 'modern' toy graph, person -created-> software. If I run:

> g.V().hasLabel('person').valueMap()
==>[name:[marko], age:[29]]
==>[name:[vadas], age:[27]]
==>[name:[josh], age:[32]]
==>[name:[peter], age:[35]]

then the result is a list of entities with their properties. Assume a person can only ever create one piece of software (although hopefully we will see later how this could be opened up to lists of created software). What I want is to include the language ('lang') property of the created software in the returned entity, to get:

> <run some query here>
==>[name:[marko], age:[29], lang:[java]]
==>[name:[vadas], age:[27], lang:[java]]
==>[name:[josh], age:[32], lang:[java]]
==>[name:[peter], age:[35], lang:[java]]

The best suggestion so far produces the following:

> g.V().hasLabel('person').union(identity(), out("created")).valueMap().unfold().group().by {it.getKey()}.by {it.getValue()}
==>[name:[marko, lop, lop, lop, vadas, josh, ripple, peter], lang:[java, java, java, java], age:[29, 27, 32, 35]]

I hope that's clearer. If not please let me know.


Solution

  • Since you didn't provide a sample graph, I'll use TinkerPop's toy graph to show how it's done.

    Assume you want to merge marko and lop:

    gremlin> g = TinkerFactory.createModern().traversal()
    ==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
    gremlin> g.V(1).valueMap()
    ==>[name:[marko],age:[29]]
    gremlin> g.V(1).out("created").valueMap()
    ==>[name:[lop],lang:[java]]
    

    Note that there are two name properties, and in theory you can't predict which name makes it into the merged result; however, that doesn't seem to be an issue in your graph.

    Get the properties for both vertices:

    gremlin> g.V(1).union(identity(), out("created")).valueMap()
    ==>[name:[marko],age:[29]]
    ==>[name:[lop],lang:[java]]
    

    Merge them:

    gremlin> g.V(1).union(identity(), out("created")).valueMap().
               unfold().group().by(select(keys)).by(select(values))
    ==>[name:[lop],lang:[java],age:[29]]
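
    To see why the merge works, it can help to run the pipeline one stage at a time. The sketch below assumes the same toy graph and Gremlin Console session as above (`g` from `TinkerFactory.createModern()`), and uses only steps that already appear in this answer. Output is omitted on purpose, since the exact formatting and the winner of the name collision may vary between TinkerPop versions.

    ```groovy
    // Assumes: g = TinkerFactory.createModern().traversal()

    // Stage 1: union() emits both vertices; valueMap() turns each one into
    // its own property map. unfold() then splits every map into individual
    // key/value entries, so the stream no longer knows which vertex an
    // entry came from.
    g.V(1).union(identity(), out("created")).valueMap().unfold()

    // Stage 2: group() reassembles those entries into a single map keyed by
    // property name. Both vertices carry a 'name', so those two entries
    // land under the same key, which is the collision noted above.
    g.V(1).union(identity(), out("created")).valueMap().
      unfold().group().by(select(keys)).by(select(values))
    ```

    Because the entries are regrouped purely by key, no property name has to be spelled out anywhere in the query.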
    

    UPDATE

    Thank you for the added sample output. That makes it a lot easier to come up with a solution (although I think your output contains errors; vadas didn't create anything).

    gremlin> g.V().hasLabel("person").
               filter(outE("created")).map(
                 union(valueMap(),
                       outE("created").limit(1).inV().valueMap("lang")).
                 unfold().group().by {it.getKey()}.by {it.getValue()})
    ==>[name:[marko], lang:[java], age:[29]]
    ==>[name:[josh], lang:[java], age:[32]]
    ==>[name:[peter], lang:[java], age:[35]]
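
    The same pattern should carry over to the identity/state model from the question. What follows is only a sketch under stated assumptions: the sample data is hypothetical, the 'state' vertex label is made up for illustration (the question doesn't name it), and the `addV(label)` / `addE(...).from(...).to(...)` calls assume TinkerPop 3.1 or later. It uses `select(keys)` / `select(values)` from the first example instead of the lambdas, which avoids closures if your server disallows them.

    ```groovy
    // Hypothetical sample data modelled on the question: one 'product'
    // identity vertex with a 'cid', linked by a current 'state' edge
    // (to = 8640000000000000) to a vertex holding the actual properties.
    graph = TinkerGraph.open()
    g = graph.traversal()
    g.addV('product').property('cid', 1).as('p').
      addV('state').property('name', 'Cheese').property('price', 2.50d).as('s').
      addE('state').from('p').to('s').property('to', 8640000000000000L).iterate()

    // For each product, merge two maps: only 'cid' from the identity
    // vertex, and every property of the current state vertex. The state
    // properties are never listed by name.
    g.V().hasLabel('product').map(
      union(valueMap('cid'),
            outE('state').has('to', 8640000000000000L).inV().valueMap()).
      unfold().group().by(select(keys)).by(select(values)))
    ```

    Wrapping the merge in `map()` keeps the `group()` scoped to one product at a time; without it, the properties of all products would be grouped into a single map, as in the output shown in the question's edit. A product with no current state edge would come back with only its `cid`; add a `filter(outE('state').has('to', 8640000000000000L))` before the `map()`, as in the query above, if those should be dropped.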