apache-spark apache-spark-mllib collaborative-filtering

How to use the Apache Spark ALS (alternating least squares) algorithm with limited rating values

I am trying to use ALS, but currently my data is limited to information about what each user bought. So I feed Apache Spark's ALS with Ratings equal to 1 (one) whenever user X bought item Y, and that is the only information I give the algorithm.

I tried training it with the data split into train/test/validation sets, and I also tried training on all the data, but in the end I got predictions with extremely similar values for every user-item pair (the values differ only in the 5th or 6th decimal place, e.g. 0.86001 and 0.86002).

I have been thinking about this, and maybe it happens because I can only provide ratings equal to 1. Can ALS not be used in such an extreme situation?

Is there any trick with the ratings that I could use to fix this problem? I only have information about what was bought. Later I am going to get more data, but at the moment I have to use some kind of collaborative filtering until I acquire it. In other words, I need to show the user some kind of recommendation on the startup page. I chose ALS for the startup page, but maybe I should use something else, and if so, what exactly?

Of course, I have tried changing parameters such as iterations, lambda, and rank.

Solution

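In this case, the key is that you must use trainImplicit, which does not try to reproduce the Rating's value; it treats that value only as a confidence weight that the user is interested in the item. Otherwise you're asking ALS to predict ratings in a world where everyone rates everything 1. The right answer is invariably 1, so all your predictions come out nearly identical.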
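
As a rough illustration of the above, here is a minimal PySpark sketch. It assumes you already have a SparkContext (`sc`) and a list of (user_id, item_id) purchase pairs with integer ids. The helper names `purchases_to_ratings` and `train_implicit_model`, as well as the values chosen for rank, iterations, lambda_ and alpha, are made up for this example and would need tuning on your data.

```python
def purchases_to_ratings(purchases):
    """Turn (user_id, item_id) purchase pairs into (user, item, 1.0) triples.

    Duplicate pairs are collapsed, so each purchase becomes a single
    "user interacted with item" signal. (Hypothetical helper.)
    """
    return [(user, item, 1.0) for user, item in sorted(set(purchases))]


def train_implicit_model(sc, purchases, rank=10, iterations=10,
                         lambda_=0.01, alpha=1.0):
    """Fit an implicit-feedback ALS model on purchase data.

    The hyperparameter defaults here are placeholders, not recommendations.
    """
    # Imported lazily so the helper above works without Spark installed.
    from pyspark.mllib.recommendation import ALS, Rating

    ratings = sc.parallelize(
        [Rating(u, i, r) for u, i, r in purchases_to_ratings(purchases)]
    )
    # trainImplicit instead of train: the 1.0 is a confidence signal,
    # not a score the model should reproduce.
    return ALS.trainImplicit(ratings, rank, iterations=iterations,
                             lambda_=lambda_, alpha=alpha)
```

With a model trained this way, something like `model.recommendProducts(user_id, 10)` should return the top items for a user. The predicted values are preference scores for ranking items, not ratings, so compare them against each other rather than against 1.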