Tags: scala, apache-spark, rdd, apache-spark-mllib, recommendation-engine

How to generate personal recommendations for a user that exclude movies they have already rated, in Scala using the Spark MLlib ALS algorithm?


I'm generating movie recommendations for a user with the ALS algorithm on the MovieLens dataset. Everything works fine, but sometimes ALS returns movies the user has already rated. I want to exclude those from the recommendations. My current attempt is below.

    val moviesRatedbyUser = ratings.keyBy(_._2.user).lookup(206547)
    println("rated movies are " + moviesRatedbyUser)
    // Problem: moviesRatedbyUser is a Seq[(Long, Rating)], not a set of movie IDs,
    // so it cannot serve as the filter predicate on the next line.
    val candidates =
      sc.parallelize(movies.keys.filter(!moviesRatedbyUser(_)).toSeq)
    val recommendations = bestModel.get
      .predict(candidates.map((206547, _)))
      .collect()
      .sortBy(-_.rating)
      .take(10)

    var i = 1
    println("Movies recommended for you:")
    recommendations.foreach { r =>
      println("%2d".format(i) + ": " + movies(r.product))
      i += 1
    }

Here I looked up the user ID in the ratings RDD. The result was `moviesRatedbyUser: Seq[(Long, org.apache.spark.mllib.recommendation.Rating)] = WrappedArray((3,Rating(206547,80,1.0)))`. How do I grab just the movie ID (80 in this case) so that I can exclude it from the generated recommendations?


Solution

  • Figured out how to do it. The code is below:

    val userId = 206547
    // lookup returns every (Long, Rating) pair stored for this user
    val moviesForUser = ratings.keyBy(_._2.user).lookup(userId)
    // This is the line I was missing: keep only the movie IDs.
    // Mapping over the Seq directly avoids toMap, which would silently
    // drop ratings that share the same Long key.
    val ratedMovieIds = moviesForUser.map(_._2.product).toSet
    val candidates =
      sc.parallelize(movies.keys.filter(!ratedMovieIds.contains(_)).toSeq)
    val recommendations = bestModel.get
      .predict(candidates.map((userId, _)))
      .collect()
      .sortBy(-_.rating)
      .take(10)

    println("Movies recommended for you:")
    recommendations.zipWithIndex.foreach { case (r, i) =>
      println("%2d".format(i + 1) + ": " + movies(r.product))
    }
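
The ID-extraction step can be tried without a cluster. Below is a minimal sketch in plain Scala collections, not the Spark job itself. It assumes a stand-in `Rating` case class with the same three fields as MLlib's, and the sample pairs, the `movies` catalogue, and the object name `ExcludeRatedSketch` are all made up for illustration. It also shows why going through `toMap` first is risky: pairs that share the same `Long` key collapse into one entry, so some rated movies would slip back into the candidates.

```scala
// Stand-in for org.apache.spark.mllib.recommendation.Rating (assumption:
// same three fields, so the sketch runs without Spark on the classpath).
final case class Rating(user: Int, product: Int, rating: Double)

object ExcludeRatedSketch {
  // Hypothetical result of ratings.keyBy(_._2.user).lookup(206547):
  // each element is (Long key, Rating); two ratings share the key 3L.
  val moviesForUser: Seq[(Long, Rating)] = Seq(
    (3L, Rating(206547, 80, 1.0)),
    (3L, Rating(206547, 1210, 4.0)),
    (7L, Rating(206547, 260, 5.0))
  )

  // Map over the pairs directly and keep only the movie ID.
  val ratedMovieIds: Set[Int] = moviesForUser.map(_._2.product).toSet

  // Going through toMap keeps one Rating per Long key, so movie 80 is lost.
  val viaToMap: Set[Int] = moviesForUser.toMap.values.map(_.product).toSet

  // Hypothetical movie catalogue keyed by movie ID.
  val movies: Map[Int, String] = Map(
    80 -> "Movie A", 260 -> "Movie B", 1210 -> "Movie C", 2628 -> "Movie D"
  )

  // Candidates are the catalogue IDs the user has not rated yet.
  val candidateIds: Seq[Int] = movies.keys.filterNot(ratedMovieIds).toSeq.sorted

  def main(args: Array[String]): Unit = {
    println("rated:      " + ratedMovieIds.toSeq.sorted.mkString(", "))
    println("via toMap:  " + viaToMap.toSeq.sorted.mkString(", "))
    println("candidates: " + candidateIds.mkString(", "))
  }
}
```

Using a `Set` for the rated IDs keeps each membership check cheap, which matters when the catalogue is much larger than one user's rating history.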