gremlin, tinkerpop, tinkerpop3

gremlin intersection operation


I'm using the Gremlin Console v3.3.1 with the "Modern" graph from the tutorial: http://tinkerpop.apache.org/docs/current/tutorials/getting-started/

Creating the graph with this:

gremlin> graph = TinkerFactory.createModern()
gremlin> g = graph.traversal()

I can find all the people that know "vadas" like this:

g.V().hasLabel('person').has('name', 'vadas').in('knows').hasLabel('person').valueMap()

And I can find all the people that created the software "lop" with this:

g.V().hasLabel('software').has('name', 'lop').in('created').hasLabel('person').valueMap()

I can find all the people that know "vadas" OR created "lop" with a union operation:

g.V().union(
  g.V().hasLabel('person').has('name', 'vadas').in('knows').hasLabel('person'),
  g.V().hasLabel('software').has('name', 'lop').in('created').hasLabel('person')
).dedup().valueMap()

But I can't figure out how to find all the people that know "vadas" AND created "lop". Essentially I want an INTERSECT operation (I think), but there is no such thing that I can find.

Any help?


Solution

  • There are likely other ways to do this, but here are a few that I came up with. The first uses the match() step:

    gremlin> g.V().match(
    ......1>   __.as('a').out('created').has('software','name','lop'),
    ......2>   __.as('a').out('knows').has('person','name','vadas')).
    ......3>   select('a')
    ==>v[1]
    

    The second just uses the and() step:

    gremlin> g.V().and(
    ......1>   out('created').has('software','name','lop'),
    ......2>   out('knows').has('person','name','vadas'))
    ==>v[1]
    

    Both could potentially require full scans of all vertices (I'm not sure which graph databases would optimize those traversals to use indices), so I also tried this:

    gremlin> g.V().has('person','name','vadas').in('knows').hasLabel('person').
    ......1>   V().has('software','name','lop').in('created').hasLabel('person').
    ......2>   path().
    ......3>   filter(union(range(local,1,2), 
    ......4>                range(local,3,4)).
    ......5>          fold().
    ......6>          dedup(local).
    ......7>          count(local).is(1)).
    ......8>   tail(local)
    ==>v[1]
    

    It basically grabs the path() of the two traversals over V() and then analyzes it to look for matches between path positions (positions 1 and 3, which hold the person reached by each traversal). As soon as I saw that traversal, I realized it could all be simplified down to:

    gremlin> g.V().has('person','name','vadas').in('knows').hasLabel('person').as('a').
    ......1>   V().has('software','name','lop').in('created').hasLabel('person').as('b').
    ......2>   select('a').
    ......3>   where('a',eq('b'))
    ==>v[1]
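
    A further variant, offered as a sketch and only reasoned through against the toy "Modern" graph: start from the "vadas" lookup and filter the resulting people with a where() traversal, so that no second V() or step labels are needed. Whether a particular graph database serves the initial has() from an index, or optimizes this any better than the traversals above, is an assumption rather than something established here:

    gremlin> g.V().has('person','name','vadas').in('knows').hasLabel('person').
    ......1>   where(out('created').has('software','name','lop'))
    ==>v[1]

    The where() step keeps a person only if the inner traversal returns at least one result, which makes it behave like the intersection asked about in the question.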