gremlin, gremlin-server, janusgraph

Gremlin: Count connections ignoring edges with a parallel edge in the opposing direction


I'm currently working with a graph that represents connections between vertices. Vertices can be connected in either direction, or in both. I want to know how many pairs of vertices are connected to each other, regardless of the direction of the connection or whether connections exist in both directions.

So, for example, in the graph sketched below the total number of connected vertex pairs would be 3 (whilst a simple edge count would tell us there are 4).

Example Graph

Because of the directionality of the edges, this isn’t the same problem as the one solved by the duplicate edge detection in the TinkerPop recipes. Is there a Gremlin query which could help with this count?

I’ve included some example data below:

vertex1 = graph.addVertex("example","vertex1")
vertex2 = graph.addVertex("example","vertex2")
vertex3 = graph.addVertex("example","vertex3")
vertex4 = graph.addVertex("example","vertex4")

vertex1.addEdge("Connected_to",vertex2)
vertex2.addEdge("Connected_to",vertex1)
vertex2.addEdge("Connected_to",vertex3)
vertex3.addEdge("Connected_to",vertex4)

I’m new to the Gremlin language and I’m having trouble creating a query which counts the number of connections between vertices. It would be great to get some help from you guys as I get to grips with the complexities of Graph queries!


Solution

  • You can dedup() by the IDs of the two vertices. Just make sure the two vertices always appear in a consistent order (e.g. ordered by their id), so that the edge direction has no impact.

    gremlin> g.E()
    ==>e[8][0-Connected_to->2]
    ==>e[9][2-Connected_to->0]
    ==>e[10][2-Connected_to->4]
    ==>e[11][4-Connected_to->6]
    gremlin> g.E().dedup().by(bothV().order().by(id).fold())
    ==>e[8][0-Connected_to->2]
    ==>e[10][2-Connected_to->4]
    ==>e[11][4-Connected_to->6]
    gremlin> g.E().dedup().by(bothV().order().by(id).fold()).count()
    ==>3
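
If it helps to see why the ordering matters, here is a minimal plain-Python model of the same idea. It is not Gremlin and does not use any TinkerPop API; the edge list simply mirrors the console output above, and the function name count_connected_pairs is made up for illustration. Sorting the two endpoint ids gives an edge and its reverse the same key, so a set keeps only one of them.

```python
# Plain-Python model of the dedup idea; not Gremlin and not a TinkerPop API.
# Each tuple mirrors an edge from the console output: (out-vertex id, in-vertex id).
edges = [(0, 2), (2, 0), (2, 4), (4, 6)]


def count_connected_pairs(edges):
    """Count distinct vertex pairs, ignoring edge direction (hypothetical helper)."""
    # Sorting the endpoints maps (0, 2) and (2, 0) to the same key (0, 2).
    unique_pairs = {tuple(sorted(pair)) for pair in edges}
    return len(unique_pairs)


print(len(edges))                    # 4 directed edges
print(count_connected_pairs(edges))  # 3 connected pairs
```

Without the sort step, (0, 2) and (2, 0) would be treated as different keys and the count would stay at 4, which is the same reason the Gremlin query orders the vertices before folding them.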