hadoop, mahout, random-forest

Mahout: How to Use Random Forests to Make Online Predictions


I just tried the BreimanExample on UCI's glass data after working through this simple example:

https://cwiki.apache.org/MAHOUT/breiman-example.html

My question is, once I create a RandomForest in Mahout, how do I "load it" in order to make predictions with it?

With sklearn in Python this is easy: pickle the forest to disk, load it later, and put it behind a web server for live interaction.
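For reference, the sklearn workflow I have in mind looks roughly like the sketch below. The training data, the temporary file location, and the hyperparameters are made up for illustration; only `RandomForestClassifier` and the standard `pickle` module are real APIs here.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.ensemble import RandomForestClassifier

# Tiny made-up training set: two features, two classes (illustrative only).
X_train = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 0.8]]
y_train = [0, 0, 1, 1]

# Train the forest once, offline.
forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(X_train, y_train)

# Persist the fitted forest to disk (hypothetical path in a temp directory).
model_path = Path(tempfile.mkdtemp()) / "forest.pkl"
with model_path.open("wb") as f:
    pickle.dump(forest, f)

# Later, e.g. inside a web request handler: load the forest and predict.
with model_path.open("rb") as f:
    loaded_forest = pickle.load(f)

new_samples = [[0.05, 0.1], [0.95, 0.9]]
predictions = loaded_forest.predict(new_samples)
```

The point is that the fitted model is a single artifact that a separate process can load and query, which is what I am looking for on the Mahout side.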

But what about with Mahout and Hadoop? If I build a RandomForest at scale, how do I capture and use the output to make future predictions?


Solution

  • Try following the example at https://cwiki.apache.org/MAHOUT/partial-implementation.html (the partial implementation walkthrough). BuildForest builds the model, and the TestForest code shows how to load that model and make predictions with it.