Create a group vertex for each group, and create outgoing edges to the group vertices


I have a Gremlin query that groups vertices based on two properties:

g.V().hasLabel("PERSON").
  group().
    by(values('favorite_brand', 'favorite_color').fold()).
  next()

It returns each group key mapped to the list of vertices in that group:

{('adidas', 'blue'): [v[123], v[456]]}
{('nike', 'red'): [v[789]]}

How can I, for each group, create a vertex with an outgoing edge to every vertex in that group, and also set the new group vertex's properties to the values the group was keyed on?

For example, for the output above I would create two new vertices. Group Vertex 1 would have 'favorite_brand' set to adidas and 'favorite_color' set to blue, and would have two outgoing edges, one to each of the vertices 123 and 456.

The same goes for Group Vertex 2.

Is there a way to do this in a single Gremlin query, or do I have to store the returned hashmap in a variable and loop over it in my lambda to create the new vertices? I'm familiar with the addV step, but how would I iterate through each entry in the hashmap and then access its list value? Thanks!

I have looked at the official TinkerPop documentation to understand the group step, but I didn't find enough information on how to iterate through the results and perform actions on them.


Solution

  • Not having your data, the answer below is built using the air-routes data set. The initial group can be built using:

    g.V().hasId('3','8','12','13','14').
      group().
        by(values('region', 'country').fold()).
      unfold()
    

    which yields

    {('US-CA', 'US'): [v[13]]}
    {('US-NY', 'US'): [v[12], v[14]]}
    {('US-TX', 'US'): [v[3], v[8]]}
    

    From there we can build a query to unroll the group while creating the new nodes and edges.

    g.V().hasId('3','8','12','13','14').
      group().
        by(values('region', 'country').fold()).
      unfold().as('grp').
      addV('group').as('new-node').
        property('region',select(keys).limit(local,1)).
        property('country',select(keys).tail(local)).
      sideEffect(select('grp').select(values).unfold().addE('new-edge').from('new-node'))
    
    

    This returns the newly created nodes; the edges are created too, inside the sideEffect step.

    v[26c2d653-5e9c-3ec9-0854-6ed2a212c63b]
    v[80c2d653-5e9c-b838-063c-82f6d21cd6e5]
    v[42c2d653-5e9d-31c7-2c02-c7bf72fe8e38]
    

    We can use the query below to verify everything has worked.

    g.V().hasLabel('group').out().path().by(valueMap()).by(id())
    

    Which returns

    path[{'region': ['US-CA'], 'country': ['US']}, 13]
    path[{'region': ['US-NY'], 'country': ['US']}, 12]
    path[{'region': ['US-NY'], 'country': ['US']}, 14]
    path[{'region': ['US-TX'], 'country': ['US']}, 3]
    path[{'region': ['US-TX'], 'country': ['US']}, 8]
    

    I used Amazon Neptune to build this answer, but it should work on other TinkerPop-compliant stores.
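
To make the group-then-unroll logic easier to follow, here is a rough plain-Python model of what the traversal above does. It is only a sketch and does not talk to a graph: the `airports` records mirror the five air-routes vertices shown earlier, while the helpers `group_by_props` and `unroll_groups` and the `group-N` ids are hypothetical names made up for illustration. It also resembles the client-side loop the question mentions, where each appended node or edge would instead be an addV or addE call.

```python
from collections import defaultdict


def group_by_props(vertices, keys):
    """Model of group().by(values(k1, k2).fold()): key tuple -> member ids."""
    groups = defaultdict(list)
    for v in vertices:
        groups[tuple(v[k] for k in keys)].append(v["id"])
    return dict(groups)


def unroll_groups(groups, keys):
    """Model of unfold() + addV() + addE(): one group node per map entry."""
    nodes, edges = [], []
    for n, (key, members) in enumerate(groups.items(), start=1):
        node_id = f"group-{n}"  # hypothetical id; a real graph assigns its own
        # Copy the grouping values onto the new node, like the property() steps.
        nodes.append({"id": node_id, "label": "group", **dict(zip(keys, key))})
        # One outgoing edge from the group node to every member of the group.
        edges.extend((node_id, "new-edge", member) for member in members)
    return nodes, edges


# Stand-in records for the five vertices used in the answer.
airports = [
    {"id": 3, "region": "US-TX", "country": "US"},
    {"id": 8, "region": "US-TX", "country": "US"},
    {"id": 12, "region": "US-NY", "country": "US"},
    {"id": 13, "region": "US-CA", "country": "US"},
    {"id": 14, "region": "US-NY", "country": "US"},
]

keys = ("region", "country")
groups = group_by_props(airports, keys)
nodes, edges = unroll_groups(groups, keys)

for node in nodes:
    print(node)
for edge in edges:
    print(edge)
```

The single-traversal version has the advantage that grouping and writing happen in one request, whereas a client-side loop like this one would need a round trip per group unless the writes are batched.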