Gremlin Path query with optional empty/termination of path


In a graph like the one below, the 'Assembly' vertices can optionally be linked to template vertices ('TemplateAssembly') by a 'typeof' edge. To retrieve the graph hierarchy, my current query uses path(). That works fine when there are no template links, but to get the templates too I have to add an extra hop. Is there a way to get both kinds of path in a single query? Adding the extra 'typeof' hop excludes the assemblies that have no link to a template, which is not what I want.

g.addV('Model').as('1').
  addV('Assembly').as('2').
  addV('Assembly').as('3').
  addV('TemplateAssembly').as('4').
  addE('child').from('1').to('2').
  addE('child').from('1').to('4').
  addE('typeof').from('2').to('4').
  addE('child').from('2').to('3')

Query that gets the extra path element for assemblies that are linked to a template (returns only one path)

g.V('64984')
.repeat(out('child').hasLabel('Assembly').dedup().out('typeof'))
.emit().path()

Query that gets all the assemblies (but is missing the extra path element for those with a typeof link)

g.V('64984')
.repeat(out('child').hasLabel('Assembly').dedup())
.emit().path()

Solution

  • Combining the edge labels in a single out() step, as shown below, seems a reasonable approach, unless I am missing a subtlety in the question.

    gremlin>  g.V().hasLabel('Model').
    ......1>        repeat(out('child','typeof')).
    ......2>        until(__.not(out())).
    ......3>        path().
    ......4>          by(label)   
    
    ==>[Model,TemplateAssembly]
    ==>[Model,Assembly,Assembly]
    ==>[Model,Assembly,TemplateAssembly]  
    

    I suspect, though, that the parallel routes from the Model to the TemplateAssembly (one direct, one through the Assembly) are not really wanted as separate paths. Adding the vertex ids to the path shows that vertex 3 is reached both ways:

    gremlin>  g.V().hasLabel('Model').
    ......1>        repeat(out('child','typeof')).
    ......2>        until(__.not(out())).
    ......3>        path().
    ......4>          by(union(id(),label()).fold())  
    
    ==>[[0,Model],[3,TemplateAssembly]]
    ==>[[0,Model],[1,Assembly],[2,Assembly]]
    ==>[[0,Model],[1,Assembly],[3,TemplateAssembly]]
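
    If the template is only wanted as an optional last element of each assembly path, one possible variation is to keep the repeat() on 'child' edges and wrap the 'typeof' hop in optional(). This is a minimal sketch against the sample graph from the question, assuming the hierarchy itself only follows 'child' edges between 'Assembly' vertices; it has not been checked against the real data.

    ```groovy
    // Walk the 'child' hierarchy, emitting every Assembly that is reached,
    // then append the linked template to the path only when a 'typeof' edge exists.
    g.V().hasLabel('Model').
      repeat(out('child').hasLabel('Assembly')).
        emit().
      optional(out('typeof')).
      path().
        by(label)
    ```

    On the sample graph this should give one path per assembly, [Model,Assembly,TemplateAssembly] for the linked assembly and [Model,Assembly,Assembly] for the unlinked one (order may vary), because optional() passes a traverser through unchanged when its inner traversal finds nothing.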
    

    I think changing the data model so that a TemplateAssembly is connected to the next Assembly would significantly simplify the query.

    [Diagram: the example graph that helped me reason about the query.]