I ran a K-means example and have an RDD with my data named parsedData and a model named clusters. I want to create a mapped RDD that pairs each data point with the cluster the model predicts for it. So I tried
val predictions = parsedData.map { point =>
  val pointPred = clusters.predict(point)
  Array(point, pointPred)
}
When I try
predictions.first()
I get
Array[Any] = Array([0.8898668778942382,0.89533945283595], 0)
which is the result I want. So then I tried
predictions.saveAsTextFile ("/../ClusterResults");
to save the array for each data point to a local file, but the file that was created contained
[Ljava.lang.Object;@3b43c55c
[Ljava.lang.Object;@5e523969
[Ljava.lang.Object;@68374cdf ....
It had the object references and not the data. I also tried to print from the RDD with
predictions.take(10).map(println)
and got the object references again. How can I get the data rather than the objects and save it to a local file?
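The problem lies in the way you map your data. saveAsTextFile writes each element by calling toString on it, and a JVM array does not override toString, so you get the type tag and hash code ([Ljava.lang.Object;@...) instead of the contents. The REPL pretty-prints arrays, which is why first() looked right. Try using a tuple instead of an Array; a tuple's toString prints its elements.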
Example:
val predictions = parsedData.map { point =>
  (point, clusters.predict(point))
}
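
If you want full control over what each line of the output file looks like, you can also turn every record into a string yourself before saving. The following is only a minimal sketch in plain Scala (no Spark needed to run it); the object name PredictionFormat, the helper toCsvLine and the sample values are made up for illustration, and the Spark usage in the comment assumes point is an MLlib Vector, which has a toArray method.

```scala
object PredictionFormat {
  // Hypothetical helper: turns one (features, cluster) pair into a CSV line.
  def toCsvLine(features: Array[Double], cluster: Int): String =
    (features.map(_.toString) :+ cluster.toString).mkString(",")

  def main(args: Array[String]): Unit = {
    // Made-up sample record standing in for one data point and its cluster.
    val features = Array(0.5, 0.25)
    val cluster = 0

    // A JVM array inherits Object.toString, so this prints a type tag
    // and a hash code rather than the contents.
    println(Array[Any](features, cluster).toString)

    // A tuple prints its elements; a List is used here because a raw
    // array nested inside the tuple would still print as a reference.
    println((features.toList, cluster)) // (List(0.5, 0.25),0)

    // Explicit formatting gives one clean line per record.
    println(toCsvLine(features, cluster)) // 0.5,0.25,0

    // With the RDD from the question this might look like (assumption:
    // point is an MLlib Vector):
    //   predictions
    //     .map { case (point, cluster) => toCsvLine(point.toArray, cluster) }
    //     .saveAsTextFile(outputPath)
  }
}
```

Formatting the line yourself keeps the file easy to parse later, since it does not depend on the toString of whatever type you happen to store in the RDD.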