I have modified the userSimilarity method of the AbstractSimilarity class with the following:
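// Users that user1 trusts, looked up in the trust-network multimap
Collection<?> c = multiMap.get(user1);
if (c.contains(user2)) {
    // user1 trusts user2: boost the computed similarity
    result = result + 0.50;
}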
I use the epinions dataset, which has two files: one with (userid, itemid, rating) records and one with a user-user trust network, which is stored in the multimap above. The ratings are loaded into the datamodel.
In short: I would like to add a value (e.g. +0.50) to a user's similarity if that user is in the trust network of the user who asks for the recommendations.
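To make the idea concrete, here is a minimal, self-contained sketch with no Mahout dependency. The class TrustBoost and its methods are hypothetical names used only for illustration, and a plain Map of Sets stands in for the multimap. The clamp to [-1, 1] is an assumption based on the range Mahout documents for user similarity values.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Mahout: adds a fixed bonus to a base
// similarity score when user1 trusts user2.
class TrustBoost {
    // truster -> set of users that the truster trusts (directed edges)
    private final Map<Long, Set<Long>> trust = new HashMap<>();
    private final double bonus;

    TrustBoost(double bonus) {
        this.bonus = bonus;
    }

    // Record one row of the trust file: `truster` trusts `trustee`.
    void addTrust(long truster, long trustee) {
        trust.computeIfAbsent(truster, k -> new HashSet<>()).add(trustee);
    }

    // Add the bonus if user1 trusts user2, then clamp to [-1, 1].
    // A NaN base similarity (unknown) passes through as NaN.
    double adjust(double baseSimilarity, long user1, long user2) {
        double result = baseSimilarity;
        Set<Long> trusted = trust.get(user1);
        if (trusted != null && trusted.contains(user2)) {
            result += bonus;
        }
        return Math.max(-1.0, Math.min(1.0, result));
    }
}
```

Inside the modified similarity method, this would be called with the similarity computed from the ratings as baseSimilarity. Note that the lookup is directed: user1 trusting user2 does not boost the similarity seen from user2's side.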
Would it be better to use two datamodels?
Thanks
You've hit upon a very interesting topic in recommenders: multi-modal or multi-action recommenders. They solve the problem of having several actions by the same users, and of how to recommend the primary action using all available data. For instance, how to recommend purchases with purchase AND page view data.
Using epinions is good intuition on your part. The problem is that, for an individual user, there may be no correlation between trust and rating. The general technique here is to correlate the two kinds of data by using a multi-action indicator. Just adding a weight may have little or no effect and, on your own real-world data, can even produce a negative effect.
The Mahout 1.0 snapshot has a new spark-itemsimilarity CLI job (you can use it like a library too) that takes two actions and correlates the second with the first, producing two "indicator" outputs. The primary action is the one you want to recommend; in this case, recommending people that an individual might like. The secondary action may be anything, but it must have user IDs in common with the primary action; in epinions it's the trust action. The epinions data is actually what is used to test this technique.
Running both inputs through spark-itemsimilarity will produce an "indicator-matrix" and a "cross-indicator-matrix". These are the core of any "cooccurrence" recommender. If you want to learn more about this technique, I'd suggest bringing it up on the Mahout mailing list: user@mahout.apache.org
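To give a rough feel for what the two outputs contain, here is a simplified sketch in plain Java. It is not Mahout code: the class CooccurrenceSketch and its count method are hypothetical, and it only counts raw co-occurrences, whereas my understanding is that the real job scores them with a log-likelihood ratio and keeps only the strongest ones.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not Mahout code: raw co-occurrence counts
// between items of a primary action and items of another action.
class CooccurrenceSketch {
    // primary:   user -> items of the action you want to recommend
    // secondary: user -> items of the other action (same user IDs)
    // Returns counts[p][s] = number of users with p in primary and s in secondary.
    static Map<String, Map<String, Integer>> count(
            Map<String, Set<String>> primary,
            Map<String, Set<String>> secondary,
            boolean skipSelf) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : primary.entrySet()) {
            Set<String> others = secondary.get(e.getKey());
            if (others == null) {
                continue; // this user has no actions of the other kind
            }
            for (String p : e.getValue()) {
                for (String s : others) {
                    if (skipSelf && p.equals(s)) {
                        continue; // ignore an item co-occurring with itself
                    }
                    counts.computeIfAbsent(p, k -> new HashMap<>())
                          .merge(s, 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

Calling count(primary, primary, true) gives a rough analogue of the indicator matrix, and count(primary, secondary, false) a rough analogue of the cross-indicator matrix. Users who appear in only one of the two inputs contribute nothing to the cross counts, which is why the two actions must share user IDs.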