I want to use Eclipse to develop my project with Mahout 0.9 and Hadoop 2.2.0.
I can run my code with Mahout 0.9 successfully, but I'm stuck on how to run my project in Hadoop mode. I think I have to install Hadoop on my computer, start it from the command line, and then I could run my project in Eclipse in Hadoop mode.
Mahout uses the MAHOUT_LOCAL
environment variable to decide between local mode and Hadoop mode on Linux, but when I set MAHOUT_LOCAL
to "", it still uses local mode. Why?
If it is impossible to run Mahout with Hadoop from Eclipse, how else can I run my project? Thanks :)
My sample code:
package com.predictionmarketing.itemrecommend;

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.UncenteredCosineSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class ItemRecommend {

    public static void main(String[] args) {
        try {
            // Load user,item,preference triples from a local file
            DataModel model = new FileDataModel(new File("data/test.txt"));

            // Item-item recommender with uncentered cosine similarity
            ItemSimilarity similarity = new UncenteredCosineSimilarity(model);
            Recommender recommender = new GenericItemBasedRecommender(model, similarity);

            // Top-10 recommendations for user 2
            List<RecommendedItem> recommendations = recommender.recommend(2, 10);
            for (RecommendedItem recommendation : recommendations) {
                System.out.println(recommendation.getItemID() + "," + recommendation.getValue());
            }
        } catch (IOException e) {
            System.out.println("There was an error.");
            e.printStackTrace();
        } catch (TasteException e) {
            System.out.println("There was a Taste Exception");
            e.printStackTrace();
        }
    }
}
Your example is not Hadoop code. The Mahout recommenders come in non-Hadoop "in-memory" versions, like the one in your example, and Hadoop versions. The Hadoop version has a very different API, since it computes all recommendations for all users and writes them to HDFS files. You can run the Hadoop version from the command line on any machine that is a Hadoop client (i.e., one that knows how to communicate with the Hadoop cluster). Type mahout recommenditembased
and it will print a help screen.
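If you'd rather launch that same job from inside Eclipse instead of the mahout shell script, you can call RecommenderJob through Hadoop's ToolRunner. This is only a rough sketch: the HDFS paths and the similarity name below are placeholders, not values from your project, so check the help screen from mahout recommenditembased for the exact option names and accepted similarity classes.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.util.ToolRunner;
    import org.apache.mahout.cf.taste.hadoop.item.RecommenderJob;

    public class HadoopItemRecommend {
        public static void main(String[] args) throws Exception {
            // HDFS paths are purely illustrative -- replace them with your own
            String[] jobArgs = {
                "--input", "/user/hduser/input/test.txt",
                "--output", "/user/hduser/output/recommendations",
                "--similarityClassname", "SIMILARITY_COSINE",
                "--numRecommendations", "10"
            };
            // RecommenderJob implements Tool, so ToolRunner wires in the cluster Configuration
            int exitCode = ToolRunner.run(new Configuration(), new RecommenderJob(), jobArgs);
            System.exit(exitCode);
        }
    }

Note that for this to actually run on the cluster (rather than in local MapReduce mode), the Hadoop configuration files for your cluster need to be on the classpath of the Eclipse run configuration.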
Once you have run the Hadoop job on the cluster, you will need to write code that looks up the recommendations for a specific user in those output files. This is often done by storing the recommendations in a database and querying it at runtime.
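For the "read the recs out of the part files" step, here's a rough sketch using the Hadoop FileSystem API. It assumes the job's default text output, where each line looks like userID<TAB>[itemID:score,itemID:score,...], and it reuses the placeholder output path from the sketch above.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RecommendationReader {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path outputDir = new Path("/user/hduser/output/recommendations");  // placeholder path

            // Each reducer writes a part-r-* file under the output directory; scan them all
            for (FileStatus status : fs.listStatus(outputDir)) {
                if (!status.getPath().getName().startsWith("part-")) {
                    continue;  // skip _SUCCESS and other non-data files
                }
                try (BufferedReader reader =
                         new BufferedReader(new InputStreamReader(fs.open(status.getPath())))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // Expected line format: userID \t [itemID:score,itemID:score,...]
                        String[] fields = line.split("\t");
                        System.out.println("user " + fields[0] + " -> " + fields[1]);
                    }
                }
            }
        }
    }

In practice you would replace the System.out.println with inserts into whatever database you plan to query at runtime.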