
Gremlin query for finding repeating complex non-linear patterns


I am trying to figure out how to match repeated instances of a non-linear pattern/blueprint, as in the example use case below:

Vertex types (each having the property "name"):

  • User
  • Relation
  • Role
  • RoleType

Edge types (without any properties):

  • InRelation, from User to Relation
  • OwnsRole, from User to Role
  • HasRole, from Relation to Role
  • IsOfType, from Role to RoleType

The complex repeating pattern between User UX and User UY that I'm considering is a combination of 5 required relations:

  • User UX is InRelation with Relation RE
  • User UY is InRelation with Relation RE (with UY != UX)
  • User UX OwnsRole Role RO
  • Relation RE HasRole Role RO
  • Role RO IsOfType RoleType RT

Here RT is one specific RoleType vertex, such as the one with name "Child". The entire pattern then means: user UX is in relation RE with user UY, user UX has defined a role RO and assigned it to their relation RE with user UY, and they have given that role the role type "Child".
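To make the schema concrete, here is a minimal sketch of how a graph of this shape could be built in the Gremlin console with an in-memory TinkerGraph. It assumes TinkerPop 3.3 or later (for `from(Vertex)` and `to(Vertex)`). The `addChild` helper, the relation and role names, and the jim -> jeff -> jill -> john chain are all hypothetical illustration data, not the actual graph.

```groovy
// Hypothetical sample data, assuming TinkerPop 3.3+ in the Gremlin console.
graph = TinkerGraph.open()
g = graph.traversal()

// One shared RoleType vertex and the four users.
child = g.addV("RoleType").property("name", "Child").next()
users = ["jim", "jeff", "jill", "john"].collectEntries { n ->
    [(n): g.addV("User").property("name", n).next()]
}

// Hypothetical helper: records that `parent` has given `kid` the role type "Child".
// It creates one Relation and one Role vertex plus the five edges of the pattern.
addChild = { parent, kid ->
    g.addV("Relation").property("name", parent + "-" + kid).as("re").
      addV("Role").property("name", parent + "'s role for " + kid).as("ro").
      addE("InRelation").from(users[parent]).to("re").
      addE("InRelation").from(users[kid]).to("re").
      addE("OwnsRole").from(users[parent]).to("ro").
      addE("HasRole").from("re").to("ro").
      addE("IsOfType").from("ro").to(child).
      iterate()
}

// Assumed family chain; the real arrangement below jeff is not known.
addChild("jim", "jeff")
addChild("jeff", "jill")
addChild("jill", "john")
```

Each call to `addChild` produces exactly one instance of the five-relation pattern, so chaining calls produces the repeated pattern the question is about.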

In e.g. OrientDB's SQL dialect using their MATCH syntax, all individual matches of this pattern can be found using the following statement:

    SELECT ux.name, uy.name FROM (
    MATCH
        {class: User, as: ux} -InRelation-> {class: Relation, as: re} <-InRelation- {class: User, as: uy, where: ($matched.ux != $currentMatch)},
        {class: User, as: ux} -OwnsRole-> {class: Role, as: ro} <-HasRole- {class: Relation, as: re},
        {class: Role, as: ro} -IsOfType-> {class: RoleType, as: rt, where: (name = 'Child')}
    RETURN ux, re, uy, ro, rt)

In Neo4j's Cypher, a similar statement can be constructed.

I have not yet managed to write an equivalent Gremlin query (mostly because using the back() step seems to mess up my traversals), but I understand that once I have one, rewriting it to find repeated instances of the pattern should be doable.

So: given this graph, how do I write a Gremlin query that

  1. finds all the direct children of the User vertex with name 'jim' (i.e. the User vertex with name 'jeff')?
  2. finds all the direct and indirect children of the User vertex with name 'jim' (i.e. the User vertices with names 'jeff', 'jill' and 'john')?

Solution

  • Some Gremlin statements to rebuild your sample graph would have been helpful, but here's my guess:

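    // Edge labels follow the question's list: Relation -HasRole-> Role, Role -IsOfType-> RoleType.
    g.V().hasLabel("User").match(
        __.as("ux").out("InRelation").as("re"),   // UX is in relation RE
        __.as("re").in("InRelation").as("uy"),    // UY is in the same relation
        __.as("ux").out("OwnsRole").as("ro"),     // UX owns role RO
        __.as("re").out("HasRole").as("ro"),      // RE carries that same role
        __.as("ro").out("IsOfType").has("name", "Child")).
      where("ux", neq("uy")).                     // UX and UY must differ
      select("ux","uy").by("name")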
    

    However, as Stephen pointed out in his comment, you seem to be using Gremlin 2. My traversal is for Gremlin 3 (but to be honest, I'm not aware of any graph database that still uses TinkerPop 2).
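
The two numbered questions could be approached along the following lines. This is an untested sketch for Gremlin 3, assuming the edge labels from the question's list (HasRole, IsOfType) and that UX is the parent and UY the child in each match. The first traversal pins "ux" to jim in the same match() pattern. The second swaps match() for where() filters inside repeat(), which is a different formulation of the same pattern rather than the traversal given above.

```groovy
// 1. Direct children of jim: the same pattern, started from jim only.
g.V().has("User", "name", "jim").match(
    __.as("ux").out("InRelation").as("re"),
    __.as("re").in("InRelation").as("uy"),
    __.as("ux").out("OwnsRole").as("ro"),
    __.as("re").out("HasRole").as("ro"),
    __.as("ro").out("IsOfType").has("name", "Child")).
  where("ux", neq("uy")).
  select("uy").by("name").dedup()

// 2. Direct and indirect children: walk one pattern instance per repeat() iteration.
//    Each iteration goes parent -> role (of type "Child") -> relation -> other user,
//    checking that the parent is itself in that relation.
g.V().has("User", "name", "jim").
  repeat(
    __.as("ux").
       out("OwnsRole").where(__.out("IsOfType").has("name", "Child")).
       in("HasRole").where(__.in("InRelation").as("ux")).
       in("InRelation").where(neq("ux"))).
  emit().      // emit the user reached after every iteration
  dedup().
  values("name")
```

The second traversal has no until() condition, so it only terminates on data without cycles; on a graph where a user can end up being their own descendant, a guard such as simplePath() inside the repeat() would be needed.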