In a Spark process I have an RDD[Try[(A, B)]] that I have to transform using a function f: B => List[C]. What I want to obtain is an RDD[Try[(A, B, C)]], in which I flatMap the list obtained from applying f.

I tried this:
val tryRdd = // Obtain the RDD[Try[(A, B)]]
val transformedRdd =
  tryRdd.map { pair =>
    for {
      (a, b) <- pair
      c      <- f(b)
    } yield (a, b, c)
  }
Unfortunately what I am obtaining is an RDD[Try[Nothing]]. Why? Can anyone help me understand where I am wrong?

I suppose the problem is not really related to RDD: replacing RDD with a plain List would probably end with the same result.
The for-comprehension is translated to
pair.flatMap { case (a, b) => f(b).map { case c => (a, b, c) } }
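To see the mismatch concretely, here is a minimal sketch outside Spark (the Int/String types and the body of f are made up for illustration). Taking the inner expression of the desugaring on its own shows it produces a List, while Try.flatMap expects its function to return a Try:

```scala
import scala.util.{Success, Try}

// Hypothetical stand-in for the question's f: B => List[C]
def f(b: Int): List[String] = List(s"a$b", s"b$b")

val pair: Try[(Int, Int)] = Success((1, 2))

// The inner expression of the desugaring, on its own:
// a List[(Int, Int, String)], not a Try[...]
val inner: List[(Int, Int, String)] = f(2).map(c => (1, 2, c))

// pair.flatMap requires a function ((Int, Int)) => Try[U], so passing
// (a, b) => f(b).map(c => (a, b, c)) cannot typecheck as written.
```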
But f(b).map(...) will give you a List[(A, B, C)], not the Try[(A, B, C)] that pair.flatMap expects its argument to return. So the code shouldn't compile at all (unless you have a strange implicit conversion in scope).
But if you are using, say, IntelliJ, it can fail to show the error and display an incorrect inferred type (or the other way around: it can show errors in working code); you need to actually build the project to see the real errors.
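One way to get the desired RDD[Try[(A, B, C)]] is to flatMap over the RDD itself and pattern-match on the Try, so that the List produced by f becomes the level that gets flattened. A sketch with a plain List standing in for the RDD (the concrete types and the body of f are made up; rdd.flatMap behaves the same way):

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical concrete types and f, for illustration only
type A = String; type B = Int; type C = Double
def f(b: B): List[C] = List(b * 1.0, b * 2.0)

// A plain List stands in for the RDD; rdd.flatMap works the same way
val tryData: List[Try[(A, B)]] =
  List(Success(("id1", 3)), Failure(new RuntimeException("boom")))

val transformed: List[Try[(A, B, C)]] = tryData.flatMap {
  case Success((a, b)) => f(b).map(c => Success((a, b, c)))
  case Failure(e)      => List(Failure(e)) // keep each failure as a single element
}
```

Each successful pair expands into one element per result of f, while failures pass through unchanged.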