I am using GraphX for the first time and I want to build a Graph incrementally. To start, I need to connect the first two nodes with an edge, given that I have 2 RDDs (each one holds a single value):
firstRDD: RDD[((Int, Array[Int]), ((VertexId, Array[Int]), Int))]
secondRDD: RDD[((Int, Array[Int]), ((VertexId, Array[Int]), Int))]
I want to connect the first VertexId with the second one. I would appreciate your help.
Basically, you use map and case statements to pick out the VertexIds, then use RDD.zip to stitch them together, then another map to create the final RDD of Edge objects:
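import org.apache.spark.graphx.Edge

// Pull the VertexId out of each nested tuple, pair the two ids, then build an Edge.
firstRDD.map {
  case ((junk1, junk2), ((vertex1, junk3), junk4)) => vertex1
}.zip(
  secondRDD.map {
    case ((junk1, junk2), ((vertex2, junk3), junk4)) => vertex2
  }
).map { case (vertex1, vertex2) => Edge(vertex1, vertex2, 0) } // 0 is a placeholder edge attribute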
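
For completeness, here is a minimal end-to-end sketch of the same idea that also turns the resulting edges into a Graph. The sample values, the Row type alias, the local SparkContext, and the default vertex attribute of 0 are all assumptions made for illustration; they are not part of the original question. Note that RDD.zip expects both RDDs to have the same number of partitions and the same number of elements in each partition, so the sample data is forced into a single partition.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Local context for illustration only; in spark-shell, skip this line and reuse the provided `sc`.
val sc = new SparkContext(new SparkConf().setAppName("zip-edge-sketch").setMaster("local[1]"))

// Hypothetical alias for the element type used in the question.
type Row = ((Int, Array[Int]), ((VertexId, Array[Int]), Int))

// Hypothetical sample data: one element per RDD, one partition each,
// so the partition layout matches as zip requires.
val firstRDD: RDD[Row]  = sc.parallelize(Seq(((0, Array(1)), ((1L, Array(2)), 3))), numSlices = 1)
val secondRDD: RDD[Row] = sc.parallelize(Seq(((0, Array(4)), ((2L, Array(5)), 6))), numSlices = 1)

// Same pattern as above, with `_` for the fields that are not needed.
val edges: RDD[Edge[Int]] =
  firstRDD.map { case (_, ((v1, _), _)) => v1 }
    .zip(secondRDD.map { case (_, ((v2, _), _)) => v2 })
    .map { case (v1, v2) => Edge(v1, v2, 0) }

// Graph.fromEdges creates every vertex mentioned by an edge
// and gives it the default attribute supplied here (assumed to be 0).
val graph: Graph[Int, Int] = Graph.fromEdges(edges, defaultValue = 0)

val collected = edges.collect()
```

If the two RDDs might not share the same partitioning, zip will fail at runtime. Since each RDD holds a single value here, firstRDD.cartesian(secondRDD) may be a more forgiving way to pair the two ids, at the cost of producing every combination if the RDDs ever grow.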