gremlin, tinkerpop, tinkerpop3

Using derived values to filter gremlin traversals


Good morning!

I have the following data model where actions follow a journey that can be uniquely identified by the connecting edges having a label that matches a Journey ID. See below for a sample.

[Image: Data Model]

I want to group each unique journey together and give each one a count. For example, in the data above, if Jeremy woke up in the morning and ate eggs, and then in the evening ate toast, I would want to see:

Jeremy/Morn->Eats->Eggs->JourneyEnd, count: 1

Jeremy/Eve->Eats->Toast->JourneyEnd, count: 1

Instead I (understandably) get:

Jeremy/Morn->Eats->Eggs->JourneyEnd

Jeremy/Eve->Eats->Toast->JourneyEnd

Jeremy/Morn->Eats->Toast->JourneyEnd

Jeremy/Eve->Eats->Eggs->JourneyEnd

I've tried filtering using repeat, and statements like:

g.V().hasLabel('UserJourney').as('root').
out('firstStep').repeat(
    outE().filter(
        label().is(select('root').by(id())))).
until(hasLabel('JourneyEnd')).path()

but (I think) this is not viable because of the way the traversal works: the 'root' step contains all journeys by the time I go back to read it.

Any suggestions on how to get the output I'm looking for are most welcome. The setup script is below:

g.addV('UserJourney').property(id, 'Jeremy/Morn').
  addV('UserJourney').property(id, 'Jeremy/Eve').
  addV('JourneyStep').property(id, 'I Need').
  addV('JourneyStep').property(id, 'Eats').
  addV('JourneyStep').property(id, 'Eggs').
  addV('JourneyStep').property(id, 'Toast').
  addV('JourneyEnd').property(id, 'JourneyEnd').
  
  addE('Jeremy/Morn').from(V('Eats')).to(V('Eggs')).
  addE('Jeremy/Morn').from(V('Eggs')).to(V('JourneyEnd')).
  addE('firstStep').from(V('Jeremy/Morn')).to(V('Eats')).

  addE('Jeremy/Eve').from(V('Eats')).to(V('Toast')).
  addE('Jeremy/Eve').from(V('Toast')).to(V('JourneyEnd')).
  addE('firstStep').from(V('Jeremy/Eve')).to(V('Eats')).
  iterate()

Solution

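  • You can use the path, from, and where()...by() steps to achieve what you need. In where(eq('a')).by(label).by(id), the first by() is applied to the current edge (its label) and the second to the vertex labelled 'a' (its id), so only edges whose label equals the journey's id are followed.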

    gremlin>  g.V().hasLabel('UserJourney').as('a').out().
    ......1>        repeat(outE().where(eq('a')).by(label).by(id).inV()).
    ......2>        until(hasLabel('JourneyEnd')).
    ......3>        path().
    ......4>          from('a')   
    
    ==>[v[Jeremy/Morn],v[Eats],e[3][Eats-Jeremy/Morn->Eggs],v[Eggs],e[4][Eggs-Jeremy/Morn->JourneyEnd],v[JourneyEnd]]
    ==>[v[Jeremy/Eve],v[Eats],e[6][Eats-Jeremy/Eve->Toast],v[Toast],e[7][Toast-Jeremy/Eve->JourneyEnd],v[JourneyEnd]]
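
    To see which two values the where() step compares, they can be projected side by side for the first hop. This is only an illustrative sketch against the setup script from the question; the keys 'journey' and 'edgeLabel' are names chosen here, not part of the original answer.

    ```groovy
    // For every edge leaving the first step, show the id of the journey
    // vertex labelled 'a' next to the label of that edge.
    g.V().hasLabel('UserJourney').as('a').out().
      outE().
      project('journey', 'edgeLabel').
        by(select('a').id()).   // id of the UserJourney vertex
        by(label)               // label of the candidate edge
    ```

    With the sample data this should yield four rows, one per combination of journey and outgoing edge. where(eq('a')).by(label).by(id) keeps only the rows in which the two values are equal.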
    

    To remove the edges from the result, the edge hop can be wrapped in flatMap, so that only the vertex it ends on is added to the path:

    gremlin>  g.V().hasLabel('UserJourney').as('a').out().
    ......1>        repeat(flatMap(outE().where(eq('a')).by(label).by(id).inV())).
    ......2>        until(hasLabel('JourneyEnd')).
    ......3>        path().
    ......4>          from('a')  
    
    ==>[v[Jeremy/Morn],v[Eats],v[Eggs],v[JourneyEnd]]
    ==>[v[Jeremy/Eve],v[Eats],v[Toast],v[JourneyEnd]]
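
    The question also asks for a count per journey. One possible way to get it, sketched here under the assumption that grouping on the path of vertex ids is what is wanted, is to modulate the path with by(id) and append groupCount():

    ```groovy
    // Same traversal as above, but the path holds vertex ids instead of
    // vertices, and identical paths are counted.
    g.V().hasLabel('UserJourney').as('a').out().
      repeat(flatMap(outE().where(eq('a')).by(label).by(id).inV())).
      until(hasLabel('JourneyEnd')).
      path().
        from('a').
        by(id).
      groupCount()
    ```

    This should return a single map keyed by path. Because each path starts with the unique journey id, every count is 1 for the sample data. If the goal is instead to count how many journeys share the same sequence of steps, the path could start after the journey vertex, for example by labelling the first step with as('b') and using from('b').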