apache-spark scala-breeze

Converting a Row Matrix into a Breeze Dense Matrix


I have an MLlib distributed RowMatrix in which row order doesn't matter. Is there an easy way to convert it into a Breeze DenseMatrix? I'd imagine a row-by-row mapping might work, but I'm relatively unfamiliar with Breeze as a whole.

Edit: Using X.rows.map(x => x.toArray), I've managed to convert it into an RDD of the form org.apache.spark.rdd.RDD[Array[Double]]. I believe this is a step in the right direction...


Solution

  • Call collect on your RDD. It returns an Array[Array[Double]] on the driver, so this only works if the whole matrix fits in driver memory.

    val array = your_rdd.collect()

    One way to convert the array of arrays into a matrix is the following:

    import breeze.linalg.DenseMatrix
    val dm = DenseMatrix(array: _*)  // each inner array becomes one row of the matrix

    Part of the answer was taken from here. Hope this solves the problem.
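
Putting the two steps together, here is a minimal sketch of the whole conversion. The object name RowMatrixToBreeze and the helper names rowsToDense and rowMatrixToDense are made up for illustration and are not part of Spark or Breeze. It assumes Breeze and Spark MLlib are on the classpath and that the matrix is small enough to collect to the driver.

```scala
import breeze.linalg.DenseMatrix
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Hypothetical helper object; the names here are illustrative only.
object RowMatrixToBreeze {

  // Builds a local Breeze matrix from rows that are already on the driver.
  // Each inner array becomes one row of the result.
  def rowsToDense(rows: Array[Array[Double]]): DenseMatrix[Double] =
    DenseMatrix(rows: _*)

  // Pulls every row of the distributed matrix to the driver, then converts.
  // Row order follows whatever order collect() returns, which is fine when
  // row order doesn't matter, as in the question.
  def rowMatrixToDense(mat: RowMatrix): DenseMatrix[Double] =
    rowsToDense(mat.rows.map(_.toArray).collect())
}
```

With the RowMatrix from the question named X, usage would look like val dm = RowMatrixToBreeze.rowMatrixToDense(X). Splitting out rowsToDense keeps the Breeze part usable on a plain local array without a Spark context.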