apache-spark, rdd, accumulator

How do you parallelize an accumulator and save it as a text file in Spark


I have a pattern accumulator that I want to parallelize. How do I do this?

val patternsAcc = sc.collectionAccumulator[List[Patern]]("Paterns Accumulator")
...
...
// can't parallelize
val result = sc.parallelize(patternsAcc.value)
// save to file

Solution

  • The type of patternsAcc.value is java.util.List[List[Patern]], which is not accepted by the sc.parallelize() method.

    Simply import scala.collection.JavaConversions._, and your code should work thanks to Scala's implicit conversions (see the sketch below).
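
    A minimal sketch of that approach, assuming a placeholder Patern case class, an existing SparkContext named sc, and a hypothetical output path:

    import scala.collection.JavaConversions._ // implicitly views java.util.List as a Scala Seq

    case class Patern(value: String) // placeholder definition for illustration

    val patternsAcc = sc.collectionAccumulator[List[Patern]]("Paterns Accumulator")
    // ... the accumulator is filled inside transformations/actions ...

    // With JavaConversions in scope, the java.util.List[List[Patern]] returned by
    // patternsAcc.value is implicitly converted to a Seq, which sc.parallelize accepts.
    val result = sc.parallelize(patternsAcc.value)

    // Write each List[Patern] as one line of text (output path is hypothetical).
    result.saveAsTextFile("hdfs:///tmp/patterns-output")

    Note that scala.collection.JavaConversions is deprecated since Scala 2.12 and removed in 2.13. On newer Scala versions, use an explicit conversion instead, e.g. import scala.jdk.CollectionConverters._ and then sc.parallelize(patternsAcc.value.asScala.toSeq).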