apache-spark spark-graphx

GraphX: Given one VertexID get all connected Vertices


I have a graph in GraphX and the ID of a specific vertex in that graph.

Given that VertexId, how do I get all vertices directly connected to that vertex (i.e., only one edge away)?

Thank you


Solution

  • Let's assume you want to find all users directly connected to "franklin" (VertexId 5L) in the example graph from the GraphX Programming Guide. The simplest and probably the most efficient approach is to use collectNeighborIds (or collectNeighbors, if you also need the neighbors' attributes) followed by lookup:

    import org.apache.spark.graphx.EdgeDirection
    
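    // Pick EdgeDirection.In, EdgeDirection.Out or EdgeDirection.Either
    val direction: EdgeDirection = ???
    // Returns a Seq[Array[VertexId]] with the neighbor IDs of vertex 5L
    graph.collectNeighborIds(direction).lookup(5L)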
    

    Another approach is to use triplets and filter the results:

    // "franklin" is the source; collect with a partial function returns an RDD[VertexId]
    graph.triplets.collect {
      case t if t.srcId == 5L => t.dstId
    }
    

    Of course, you can handle the other direction as well (t.dstId == 5L) and pass along additional information such as srcAttr, dstAttr or vertexAttr. If you prefer to keep the complete triplet, replace collect with filter. Nevertheless, if you need single edge / vertex lookups, Spark is most likely not the best tool for the job.
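
For reference, here is a minimal end-to-end sketch of both approaches that can be pasted into spark-shell. It assumes a small graph modeled on the first example in the GraphX Programming Guide (the vertex and edge data below are written out from that example, so double-check them against the guide), and it assumes a local master. The names `target`, `viaNeighborIds` and `viaTriplets` are only illustrative.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeDirection, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Reuses the shell's context if one exists; the local master is an assumption.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("neighbors-sketch").setMaster("local[*]"))

// Vertices: (id, (name, role)), modeled on the Programming Guide example.
val users: RDD[(VertexId, (String, String))] = sc.parallelize(Seq(
  (3L, ("rxin", "student")), (7L, ("jgonzal", "postdoc")),
  (5L, ("franklin", "prof")), (2L, ("istoica", "prof"))))

// Directed edges: Edge(srcId, dstId, relationship).
val relationships: RDD[Edge[String]] = sc.parallelize(Seq(
  Edge(3L, 7L, "collab"), Edge(5L, 3L, "advisor"),
  Edge(2L, 5L, "colleague"), Edge(5L, 7L, "pi")))

val graph = Graph(users, relationships, ("John Doe", "Missing"))

val target: VertexId = 5L // "franklin"

// Approach 1: collectNeighborIds + lookup.
// EdgeDirection.Either covers both incoming and outgoing edges.
val viaNeighborIds: Set[VertexId] =
  graph.collectNeighborIds(EdgeDirection.Either)
    .lookup(target)      // Seq[Array[VertexId]]
    .flatMap(_.toSeq)
    .toSet

// Approach 2: scan the triplets and match the target on either end.
val viaTriplets: Set[VertexId] =
  graph.triplets.collect {
    case t if t.srcId == target => t.dstId
    case t if t.dstId == target => t.srcId
  }.collect().toSet      // second collect() brings the RDD to the driver

println(viaNeighborIds.toSeq.sorted.mkString(", "))
println(viaTriplets.toSeq.sorted.mkString(", "))
```

With the edges above, both sets should contain 2, 3 and 7: franklin has outgoing edges to 3 and 7 and an incoming edge from 2. Swapping EdgeDirection.Either for In or Out (or dropping one of the two case clauses) narrows the result to a single direction.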