Tags: scala, apache-spark, collaborative-filtering

Input for Alternating Least Squares


We are using ALS for a recommender model based on user/click data via Spark/Scala.

The rating column is a score in the range [0, 1].

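import org.apache.spark.ml.recommendation.ALS

// myrank, mylambda, myalpha, numIter and training are defined elsewhere
val als = new ALS()
    .setImplicitPrefs(true)      // use the implicit-feedback variant of ALS
    .setRank(myrank)             // number of latent factors
    .setRegParam(mylambda)       // regularization strength
    .setAlpha(myalpha)           // confidence scaling for implicit feedback
    .setMaxIter(numIter)
    .setUserCol("myuseridx")
    .setItemCol("myitemidx")
    .setRatingCol("rating")
val model = als.fit(training)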

My question is: must the input data for implicit models technically contain all user-item combinations, i.e., also the ones that were not bought?


Solution

  • ALS solves the recommender problem by alternately holding the user or the item factor matrix fixed and solving for the other with least squares. For an implicit dataset, this means that every user-item pair that is not present as a positive observation is treated as a zero. So you only need to include the positive observations.

    Some more discussion here: http://yifanhu.net/PUB/cf.pdf
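
As a rough illustration of why the missing pairs need not be listed, the implicit-feedback formulation in the linked paper derives a binary preference p_ui (1 if r_ui > 0, else 0) and a confidence c_ui = 1 + alpha * r_ui from each rating r_ui. The sketch below is plain Scala, not Spark code; the object name, the sample data, and the alpha value are made up for illustration.

```scala
// Minimal sketch of the preference/confidence mapping for implicit feedback.
// ImplicitFeedbackSketch, `observed` and `alpha` are hypothetical names/values.
object ImplicitFeedbackSketch {
  // Illustrative confidence scaling factor (an assumption, tune for your data)
  val alpha: Double = 40.0

  // Only positive observations are stored: (user, item) -> rating in (0, 1]
  val observed: Map[(Int, Int), Double] = Map(
    (0, 0) -> 1.0,
    (0, 1) -> 0.5,
    (1, 1) -> 0.8
  )

  // Any pair that is absent from the map is read as a rating of 0
  def rating(user: Int, item: Int): Double =
    observed.getOrElse((user, item), 0.0)

  // p_ui = 1 if r_ui > 0, else 0
  def preference(user: Int, item: Int): Double =
    if (rating(user, item) > 0.0) 1.0 else 0.0

  // c_ui = 1 + alpha * r_ui, so unobserved pairs get the minimum confidence of 1
  def confidence(user: Int, item: Int): Double =
    1.0 + alpha * rating(user, item)
}
```

Here the pair (1, 0) never appears in the input, yet it still has a well-defined preference of 0 and a confidence of 1, which is why adding explicit zero rows for unobserved combinations should not be necessary.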