
Find mutual edges with Spark and GraphX


I'm really new to Spark and GraphX. I have a graph in which some nodes have mutual (reciprocal) edges between them, and I want to select those edges with good performance. An example:

Source  Dst
1       2
1       3
1       4
1       5
2       1
2       5
2       6
2       7
3       1

I want to get the result:

1       2
2       1
1       3
3       1

The order may be arbitrary. Does anyone have an idea how I can get this?


Solution

  • Try:

    edges.intersection(edges.map(e => Edge(e.dstId, e.srcId, e.attr)))

    Note that this compares the Edge.attr values as well, so a pair counts as mutual only when both directions carry the same attr. If you want to ignore attr values, then do this:

    edges.map(e => (e.srcId, e.dstId)).intersection(edges.map(e => (e.dstId, e.srcId)))
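
    To see the attr-ignoring variant end to end, here is a minimal sketch run against the sample data from the question. The object name MutualEdges, the helper mutualPairs, the local[*] master, and the dummy attr value 1 are all assumptions made for illustration; only the intersection call itself comes from the answer above.

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.graphx.{Edge, Graph, VertexId}
    import org.apache.spark.rdd.RDD

    object MutualEdges {
      // Hypothetical helper: keeps only (src, dst) pairs whose reverse (dst, src)
      // is also present. Edge attributes are ignored.
      def mutualPairs[ED](edges: RDD[Edge[ED]]): RDD[(VertexId, VertexId)] = {
        val forward  = edges.map(e => (e.srcId, e.dstId))
        val reversed = edges.map(e => (e.dstId, e.srcId))
        forward.intersection(reversed)
      }

      def main(args: Array[String]): Unit = {
        // Assumption: a local master, for trying this out on one machine.
        val conf = new SparkConf().setAppName("MutualEdges").setMaster("local[*]")
        val sc = new SparkContext(conf)

        // Sample edges from the question; the attr value 1 is a placeholder.
        val pairs = Seq((1L, 2L), (1L, 3L), (1L, 4L), (1L, 5L),
                        (2L, 1L), (2L, 5L), (2L, 6L), (2L, 7L), (3L, 1L))
        val edgeRdd = sc.parallelize(pairs.map { case (s, d) => Edge(s, d, 1) })
        val graph = Graph.fromEdges(edgeRdd, defaultValue = 0)

        // Sort after collecting so the printed order is stable.
        mutualPairs(graph.edges).collect().sorted.foreach {
          case (s, d) => println(s"$s $d")
        }

        sc.stop()
      }
    }
    ```

    Two things to keep in mind with this approach: intersection returns distinct elements, so duplicate parallel edges collapse into one pair, and a self-loop (v, v) is its own reverse, so it will show up in the result unless you filter it out first.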