groovy gremlin bulbs rexster

How to include edges in Gremlin while doing breadth first search?


In my gremlin query, I have the following:

vert.as('x').
both.or(
  _().has("time").filter{ it.time.toInteger() > startTime.toInteger() },
  _().has("isRead"), _().has("isWrite")).dedup().gather.scatter.
store(y).loop('x'){c++ < limit.toInteger()}.iterate();

In my gremlin script, I return y, but y clearly only has vertices in it. I can retrieve the edges manually by iterating over them for each of the vertices, but I want a list returned that contains only the edges between the nodes returned in y.

In particular, I need to be able to recreate the returned sub-graph in a local data structure, so gremlin is being used to return that information. Two other details of my use case drive this. First, manually iterating over each node's edges is too slow, since the rexster server I'm running my bulbs script against has to push the data over the wire. Second, if I don't have the set of edges between the vertices that were originally returned by the script, then I have to check each vertex encountered along each edge to make sure that it's within the set originally returned, which is far from ideal.

Basically, any result should be such that, when I look at any of the returned vertices I can know what vertices are linked within the returned set--without having to do any manual checking or lookups. It should merely be in the dataset.

EDIT 1:

I found that gremlin's tree pipeline capability was really good for doing exactly what I wanted! The problem is, now that I use tree, I need to return it to a form that can be used... I can only return either vertexes or edges, so I can't return tree straight away.

EDIT 2:

espeed is right; I should use bothE to start with. But I have some conditions that I want to satisfy... I almost had it earlier, but I couldn't get the filter to work correctly.

vert.as('l').
bothE.gather.scatter.as('edge').bothV.or(
  _().has("time").filter{ it.getProperty('time').toInteger() >= startTime.toInteger() },
  _().has("isRead"), _().has("isWrite")).
dedup().store(results).as('vertice').back('edge').store(results).back('vertice')
.loop('l'){c++ < limit.toInteger()}.iterate();

I don't understand why I can't use two backs in one pipeline (I get a NullPointerException with this one). The basic problem I want solved is: do a breadth-first search, storing only the nodes that satisfy the or condition above, and store the edges between all vertices that pass the test.
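The requirement can be made concrete with a minimal plain-Groovy sketch. It is not Gremlin, and it is only an illustration of the intended result: the edge list, property map, vertex names, and start vertex are all hypothetical, and the edges are treated as undirected to mirror bothE/bothV.

```groovy
// Plain Groovy sketch of the desired result, not Gremlin. All data is hypothetical.
// Each edge is [id, from, to]; props maps a vertex id to its properties.
def edges = [['e1', 'a', 'b'], ['e2', 'b', 'c'], ['e3', 'a', 'd'], ['e4', 'd', 'c']]
def props = [a: [isRead: true], b: [time: 20], c: [isWrite: true], d: [time: 5]]
def startTime = 10
def limit = 2

// The "or" condition from the question: recent enough time, or isRead, or isWrite.
def passes = { v ->
    def p = props[v]
    def recent = p.time != null && p.time >= startTime
    return recent || p.containsKey('isRead') || p.containsKey('isWrite')
}

def keptVertices = ['a'] as Set   // the start vertex is kept unconditionally
def keptEdges = [] as Set
def frontier = ['a'] as Set

// One pass per breadth-first level, up to `limit` levels.
for (level in 1..limit) {
    def next = [] as Set
    for (e in edges) {
        def (id, from, to) = e
        // Try the edge in both directions, like bothE/bothV.
        for (pair in [[from, to], [to, from]]) {
            def (src, dst) = pair
            if (src in frontier && passes(dst)) {
                keptEdges << id                       // both endpoints are kept
                if (keptVertices.add(dst)) next << dst // only expand new vertices
            }
        }
    }
    frontier = next
}

// For this data: vertices a, b, c and edges e1, e2 are kept.
// d fails the test, so e3 and e4 are left out.
println "vertices: ${keptVertices}, edges: ${keptEdges}"
```

An edge is recorded only when it leads from an already-kept vertex to one that passes the test, so every stored edge has both endpoints in the returned vertex set.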


Solution

  • Without messing with your code too much, maybe the easiest thing is to do:

    results = [] as Set
    vert.as('l').
    bothE.as('e').gather.scatter.as('edge').bothV.or(
      _().has("time").filter{ it.getProperty('time').toInteger() >= startTime.toInteger()}.store(results),
      _().has("isRead"), _().has("isWrite")).store(results)
    .sideEffect{e,m->results<<m.e}
    .loop('l'){c++ < limit.toInteger()}.iterate();
    

    Note that by declaring results as a Set you can avoid the dedup step. Basically, store the vertices as you filter them in the or step.
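The Set point can be seen in isolation with a small plain-Groovy sketch. No Gremlin is involved, and the strings below are hypothetical stand-ins for the vertices and edges a traversal might visit more than once.

```groovy
// Plain Groovy, no Gremlin: strings stand in for vertices and edges (hypothetical names).
def visited = ['v1', 'e1', 'v2', 'e1', 'v1', 'e2', 'v3']

// A List keeps every occurrence, which is why a list-backed result needs dedup().
def asList = []
for (x in visited) { asList << x }

// "[] as Set" creates a set, so repeated elements are ignored on insert.
def asSet = [] as Set
for (x in visited) { asSet << x }

println "list size: ${asList.size()}, set size: ${asSet.size()}"
```

Because vertices and edges end up in the same collection here, whatever consumes results would have to tell the two kinds of element apart afterwards.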