gremlin tinkerpop3

simplePath() breaks query


When adding simplePath() inside a match(), my query no longer returns results.

The query attempts to find any event (e.g. "graph database conference") that somehow involves three specific people.

  • "alice" attended the school that hosted the event.
  • "bob" was a hot dog vendor at the event.
  • "marko" provided security for the event.

I'm using match() to find where the three people converge. If there's a better way, please suggest it. Thanks! Just starting to learn gremlin.

ASCII art:

            alice --[enrolled-in]-> gremlin 101 --[offered-by]-> graph db school --[hosted]--------------
                                                                                                        |
                                                                                                        v
bob --[works-for]-> hot dogs r awesome --[subcontractor-of]-> best event planner --[planned]----> graph conference
                                                                                                        ^
                                                                                                        |
                                            marko --[works-for]-> super security --[secured]-------------

Query that works:

g.V().match(
  __.as('alice').hasLabel('person').has('name', 'alice').repeat(__.out()).until(__.hasLabel('event')).as('event'),
  __.as('event').repeat(__.in()).until(__.hasLabel('person').has('name', 'bob')).as('bob'),
  __.as('event').repeat(__.in()).until(__.hasLabel('person').has('name', 'marko')).as('marko')).
  path()

==>[v[0],v[0],v[2],v[5],v[21],v[21],v[13],v[10],v[8],v[21],v[18],v[16],[bob:v[8],alice:v[0],event:v[21],marko:v[16]]]

Note that some vertices appear more than once (and we haven't even added any cycles yet!)
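
The repeated vertices matter because simplePath() keeps a traverser only if no element occurs twice in its path. A minimal plain-Python model of that check (not Gremlin; `is_simple` is a name made up for this sketch), applied to the vertex ids printed above, suggests why even the first three path entries would already be rejected:

```python
def is_simple(path):
    """Return True if no element occurs more than once in the path."""
    return len(set(path)) == len(path)

# Vertex ids copied from the path() output above (the trailing map of bindings is omitted).
match_path = [0, 0, 2, 5, 21, 21, 13, 10, 8, 21, 18, 16]

# First three entries of that path: alice, alice again, then gremlin 101.
print(is_simple(match_path[:3]))  # False

# The same walk without the duplicated start vertex would pass the check.
print(is_simple([0, 2, 5, 21]))   # True
```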

When I add .simplePath() to any of the repeat()s, the query returns nothing. For example, inside the first repeat():

g.V().match(
  __.as('alice').hasLabel('person').has('name', 'alice').repeat(__.out().simplePath()).until(__.hasLabel('event')).as('event'),
  __.as('event').repeat(__.in()).until(__.hasLabel('person').has('name', 'bob')).as('bob'),
  __.as('event').repeat(__.in()).until(__.hasLabel('person').has('name', 'marko')).as('marko')).
  path()

Gremlin console script to create the graph:

alice = g.addV('person').property('name', 'alice').next()
gremlin101 = g.addV('course').property('name', 'gremlin 101').next()
g.addE('enrolled-in').from(alice).to(gremlin101)

school = g.addV('school').property('name', 'graph db school').next()
g.addE('offered-by').from(gremlin101).to(school)

bob = g.addV('person').property('name', 'bob').next()
hotDogs = g.addV('business').property('name', 'hot dogs r awesome').next()
g.addE('works-for').from(bob).to(hotDogs)

eventPlanner = g.addV('business').property('name', 'best event planner').next()
g.addE('subcontractor-of').from(hotDogs).to(eventPlanner)

marko = g.addV('person').property('name', 'marko').next()
security = g.addV('business').property('name', 'super security').next()
g.addE('works-for').from(marko).to(security)

event = g.addV('event').property('name', 'graph conference').next()
g.addE('hosted').from(school).to(event)
g.addE('secured').from(security).to(event)
g.addE('planned').from(eventPlanner).to(event)

Solution

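  • match() and simplePath() will almost certainly never work well together. The path() output in the question already contains v[0] and v[21] more than once, so simplePath(), which drops any traverser whose path repeats an element, filters everything out. And if a match() did produce a simple path, match() would have been pointless in the first place. To find all the matching events, you would do something like this: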

    gremlin> g.V().has("person", "name", within("alice","bob","marko")).as("p").
    ......1>   repeat(out().simplePath()).
    ......2>     until(hasLabel("event")).
    ......3>   group().
    ......4>     by("name").
    ......5>     by(group().
    ......6>          by(select("p").by("name")).
    ......7>          by(path().by("name").fold())).unfold().
    ......8>   filter(select(values).count(local).is(3)).
    ......9>   select(keys)
    ==>graph conference
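
For readers without a Gremlin console at hand, here is a rough plain-Python model of what this traversal computes. It is an illustration under stated assumptions, not the TinkerPop implementation: `OUT`, `EVENTS`, `simple_paths_to_events` and `common_events` are names made up for this sketch, and vertices are identified by their name property. It walks out-edges from each person, skips any vertex already on the path, stops at event vertices, and keeps only events reached by all three people.

```python
from collections import defaultdict

# Out-edges of the sample graph from the question, keyed by vertex name.
OUT = {
    "alice": ["gremlin 101"],
    "gremlin 101": ["graph db school"],
    "graph db school": ["graph conference"],
    "bob": ["hot dogs r awesome"],
    "hot dogs r awesome": ["best event planner"],
    "best event planner": ["graph conference"],
    "marko": ["super security"],
    "super security": ["graph conference"],
    "graph conference": [],
}
EVENTS = {"graph conference"}  # vertices with the 'event' label


def simple_paths_to_events(start):
    """Yield every path from start to an event that never revisits a vertex."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        for nxt in OUT[path[-1]]:
            if nxt in path:        # models simplePath(): drop paths that repeat a vertex
                continue
            if nxt in EVENTS:      # models until(hasLabel("event")): emit and stop
                yield path + [nxt]
            else:
                stack.append(path + [nxt])


def common_events(people):
    """Return {event: {person: [paths]}} for events reached by every person."""
    by_event = defaultdict(lambda: defaultdict(list))
    for person in people:
        for path in simple_paths_to_events(person):
            by_event[path[-1]][person].append(path)
    # models filter(select(values).count(local).is(3)) for three people
    return {e: dict(p) for e, p in by_event.items() if len(p) == len(people)}


result = common_events(["alice", "bob", "marko"])
print(list(result))  # ['graph conference']
```

The nested dictionary mirrors the nested group() steps: the outer key is the event name, the inner key is the person's name, and each inner value is the list of paths for that person.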
    

    And if you're also interested in the paths from each person to the event:

    gremlin> g.V().has("person", "name", within("alice","bob","marko")).as("p").
    ......1>   repeat(out().simplePath()).
    ......2>     until(hasLabel("event")).
    ......3>   group().
    ......4>     by("name").
    ......5>     by(group().
    ......6>          by(select("p").by("name")).
    ......7>          by(path().by("name").fold())).unfold().
    ......8>   filter(select(values).count(local).is(3)).
    ......9>   select(values).unfold().
    .....10>   select(values)
    ==>[[bob,hot dogs r awesome,best event planner,graph conference]]
    ==>[[alice,gremlin 101,graph db school,graph conference]]
    ==>[[marko,super security,graph conference]]
    

    Note that each row is a list of paths because, in theory, every person could be connected to a specific event in more than one way. If you only need one connection between a person and an event, you can remove the fold() step from the nested group() step:

    gremlin> g.V().has("person", "name", within("alice","bob","marko")).as("p").
    ......1>   repeat(out().simplePath()).
    ......2>     until(hasLabel("event")).
    ......3>   group().
    ......4>     by("name").
    ......5>     by(group().
    ......6>          by(select("p").by("name")).
    ......7>          by(path().by("name"))).unfold().
    ......8>   filter(select(values).count(local).is(3)).
    ......9>   select(values).unfold().
    .....10>   select(values)
    ==>[bob,hot dogs r awesome,best event planner,graph conference]
    ==>[alice,gremlin 101,graph db school,graph conference]
    ==>[marko,super security,graph conference]
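
The difference between the two variants only shows up when a person has several paths to the same event. A small plain-Python sketch of that difference, using hypothetical data: the second path below, through "graph db alumni club", is invented for illustration and is not in the sample graph. Which single path Gremlin keeps without fold() is an assumption here; this sketch simply keeps the first one it sees.

```python
# Hypothetical: two paths from alice to the same event (the second one is invented).
paths = [
    ["alice", "gremlin 101", "graph db school", "graph conference"],
    ["alice", "graph db alumni club", "graph conference"],
]

all_paths = {}  # roughly like by(path().by("name").fold()): every path is collected
one_path = {}   # roughly like by(path().by("name")): a single path per person

for p in paths:
    person = p[0]
    all_paths.setdefault(person, []).append(p)
    one_path.setdefault(person, p)  # assumption: keep the first path encountered

print(len(all_paths["alice"]))  # 2
print(one_path["alice"])        # ['alice', 'gremlin 101', 'graph db school', 'graph conference']
```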