scala apache-spark rdd apache-spark-mllib recommendation-engine

How do I generate personal recommendations for a user that exclude movies the user has already rated, in Scala with the Spark MLlib ALS algorithm?

I'm generating movie recommendations for a user with the ALS algorithm on the MovieLens dataset. Everything works fine, but sometimes ALS returns movies the user has already rated, and I want to exclude those from the recommendations. My current attempt is below.

`val moviesRatedbyUser = ratings.keyBy(_._2.user).lookup(206547)
println("rated movies are" + moviesRatedbyUser)
val candidates =
  sc.parallelize(movies.keys.filter(!moviesRatedbyUser(_)).toSeq)
val recommendations = bestModel.get
  .predict(candidates.map((206547, _)))
  .collect()
  .sortBy(- _.rating)
  .take(10)

var i = 1
println("Movies recommended for you:")
recommendations.foreach { r =>
  println("%2d".format(i) + ": " + movies(r.product))
  i += 1
}`

Here I tried to look up the user ID in the ratings RDD. The print statement returned `moviesRatedbyUser: Seq[(Long, org.apache.spark.mllib.recommendation.Rating)] = WrappedArray((3,Rating(206547,80,1.0)))`. How do I grab just the movie ID (80 in this case) from that result so that I can exclude it from the generated recommendations?

Solution

I figured out how to do it. The code is below.

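val userId = 206547

// All (key, Rating) pairs for this user, collected to the driver.
val moviesForUser = ratings.keyBy(_._2.user).lookup(userId)

// Answer I wanted is this line: keep only the movie ID of each rating.
// Mapping over the pairs directly avoids toMap, which would drop
// ratings that share the same Long key.
val ratedMovieIds = moviesForUser.map(_._2.product).toSet

// Every movie the user has not rated yet.
val candidates =
  sc.parallelize(movies.keys.filter(!ratedMovieIds.contains(_)).toSeq)

// Predict a rating for each candidate and keep the ten highest.
val recommendations = bestModel.get
  .predict(candidates.map((userId, _)))
  .collect()
  .sortBy(- _.rating)
  .take(10)

println("Movies recommended for you:")
recommendations.zipWithIndex.foreach { case (r, i) =>
  println("%2d".format(i + 1) + ": " + movies(r.product))
}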
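
For anyone who wants to try the extraction and filtering steps without a Spark cluster, here is a minimal sketch on plain Scala collections. `Rating` is a stand-in case class that I'm assuming mirrors the fields of `org.apache.spark.mllib.recommendation.Rating`; `RatedMovieFilter`, `ratedMovieIds`, `candidateIds` and the sample data are made up for illustration. It maps over the looked-up pairs directly, because converting them to a `Map` first keeps only one rating per key.

```scala
// Stand-in for org.apache.spark.mllib.recommendation.Rating
// (assumption: same field names, defined here so the sketch runs without Spark).
final case class Rating(user: Int, product: Int, rating: Double)

object RatedMovieFilter {
  // Pull the movie IDs out of (key, Rating) pairs, such as those returned by lookup.
  def ratedMovieIds(pairs: Seq[(Long, Rating)]): Set[Int] =
    pairs.map { case (_, r) => r.product }.toSet

  // Keep only the movies the user has not rated yet.
  def candidateIds(allMovieIds: Iterable[Int], rated: Set[Int]): Seq[Int] =
    allMovieIds.filterNot(rated.contains).toSeq

  def main(args: Array[String]): Unit = {
    // Hypothetical sample data: both ratings share the key 3L.
    val looked = Seq(
      (3L, Rating(206547, 80, 1.0)),
      (3L, Rating(206547, 95, 4.0))
    )
    val rated = ratedMovieIds(looked)
    val candidates = candidateIds(Seq(10, 80, 95, 120), rated)
    println(s"rated: $rated, candidates: $candidates")
  }
}
```

Using a `Set` for the rated IDs also makes the `contains` check cheap when the movie catalogue is large.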