Given the following:
from pyspark import SparkContext, SparkConf
from pyspark.mllib.recommendation import ALS, Rating
r1 = (1, 1, 1.0)  # (user, product, rating) tuples
r2 = (1, 2, 2.0)
r3 = (2, 1, 2.0)
ratings = sc.parallelize([r1, r2, r3])  # `sc` is the SparkContext provided by the PySpark shell
model = ALS.trainImplicit(ratings, 1, seed=10)  # rank-1 implicit-feedback ALS model
res = model.recommendProductsForUsers(2)
I'd like to compute the top k products for every user. In general, there could be many users and products, so it would be too expensive to create an RDD of all pairs to use with recommendProducts. According to the Spark 1.5.0 documentation, recommendProductsForUsers should do the job. However, I get the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-c65e6875ea5b> in <module>()
7 model = ALS.trainImplicit(ratings, 1, seed=10)
8
----> 9 res = model.recommendProductsForUsers(2)
AttributeError: 'MatrixFactorizationModel' object has no attribute 'recommendProductsForUsers'
And, in fact, recommendProductsForUsers does not appear when listing the methods of MatrixFactorizationModel:
print(dir(model))
['__class__', '__del__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_java_loader_class', '_java_model', '_load_java', '_sc', 'call', 'load', 'predict', 'predictAll', 'productFeatures', 'rank', 'recommendProducts', 'recommendUsers', 'save', 'userFeatures']
You're looking at the wrong documentation. The mere fact that some operation is implemented in the Scala or Java API doesn't mean it is exposed to PySpark. If you check the PySpark 1.5 API docs, you'll see that it doesn't provide the requested method.
recommendUsersForProducts and recommendProductsForUsers were introduced in PySpark 1.6 with SPARK-10535.
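Once you upgrade to 1.6, the call in your question works as written. If you're stuck on 1.5, here is a minimal workaround sketch (assuming numpy is available; the variable names are mine): score user-product pairs directly from the learned factor matrices and keep the top k per user. Be aware that the cartesian product materializes every pair, so this carries exactly the cost you were hoping to avoid and only suits modest numbers of users and products.

import numpy as np

k = 2
users = model.userFeatures()        # RDD of (userId, factor vector)
products = model.productFeatures()  # RDD of (productId, factor vector)

# Score every (user, product) pair with a dot product of the factors,
# then keep the k highest-scoring products per user.
top_k = (users.cartesian(products)
              .map(lambda up: (up[0][0],
                               (up[1][0], float(np.dot(up[0][1], up[1][1])))))
              .groupByKey()
              .mapValues(lambda scored: sorted(scored, key=lambda p: -p[1])[:k]))

print(top_k.collect())

The built-in recommendProductsForUsers computes essentially the same ranking, but on the Scala side and in a blocked, more memory-friendly fashion.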