machine-learning recommendation-engine

A production-ready, real-time recommendation engine that's easy to set up


I want to store a large number of data points for user actions such as likes and tags (I have plans for both e-commerce and document management).

With these data points, I want to support functions such as:

  1. "Users who loved X also loved Y and Z" recommendations
  2. "Fetch more items similar to X and Y" clustering
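To make the two functions concrete, here is a minimal sketch based on plain co-occurrence counting: two items are related when the same users liked both. The function names (`co_occurrence`, `also_loved`, `similar_to`) and the sample likes are made up for illustration, and a real engine would likely use a normalized similarity rather than raw counts.

```python
from collections import defaultdict
from itertools import combinations


def co_occurrence(likes):
    """Count, for every pair of items, how many users liked both."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in likes.values():
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def also_loved(counts, item, top_n=3):
    """'Users who loved `item` also loved ...', most frequent first."""
    scores = counts.get(item, {})
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


def similar_to(counts, seeds, top_n=3):
    """'More items similar to the seeds': sum co-occurrence over all seeds."""
    totals = defaultdict(int)
    for seed in seeds:
        for other, n in counts.get(seed, {}).items():
            if other not in seeds:  # never recommend the seeds themselves
                totals[other] += n
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


# Hypothetical sample data: user -> liked items.
likes = {
    "alice": ["X", "Y", "Z"],
    "bob": ["X", "Y"],
    "carol": ["X", "Z", "W"],
    "dave": ["Y", "W"],
}
counts = co_occurrence(likes)
print(also_loved(counts, "X"))         # ['Y', 'Z', 'W']
print(similar_to(counts, ["X", "Y"]))  # ['Z', 'W']
```

Ties are broken alphabetically here only to keep the output deterministic.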

By production-ready and real-time, I mean that I can enter data points and run queries at the same time, and the server takes care of answering queries and updating scores by itself.
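As a rough illustration of that requirement, the sketch below keeps co-occurrence scores in memory and updates them on every write, so queries always see the latest data. The class name `LiveCoOccurrence` and its methods are hypothetical, and a single lock is the simplest possible concurrency scheme, not a tuned one.

```python
import threading
from collections import defaultdict


class LiveCoOccurrence:
    """In-memory store whose scores are updated as each like arrives."""

    def __init__(self):
        self._lock = threading.Lock()
        self._user_items = defaultdict(set)  # user -> items liked so far
        self._counts = defaultdict(lambda: defaultdict(int))

    def add_like(self, user, item):
        """Record one like and update pair counts incrementally."""
        with self._lock:
            seen = self._user_items[user]
            if item in seen:  # ignore duplicate likes
                return
            for other in seen:
                self._counts[item][other] += 1
                self._counts[other][item] += 1
            seen.add(item)

    def also_loved(self, item, top_n=3):
        """Items most often liked together with `item`."""
        with self._lock:
            scores = dict(self._counts.get(item, {}))  # snapshot under lock
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [other for other, _ in ranked[:top_n]]


store = LiveCoOccurrence()
for user, item in [("alice", "X"), ("alice", "Y"), ("bob", "X"), ("bob", "Z")]:
    store.add_like(user, item)
first = store.also_loved("X")   # Y and Z are tied at this point

# New data arrives between queries; no batch recomputation is needed.
store.add_like("carol", "X")
store.add_like("carol", "Z")
second = store.also_loved("X")  # Z now outranks Y
```

Writes cost time proportional to the number of items the user already liked, which is the price of keeping queries cheap.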


I searched around the web, and the solutions that come up are one of the following:

  1. Data-mining libraries that are mostly academic-oriented and meant for large batch operations, not for heavy real-time queries.
  2. Hadoop/Mahout, which is production-ready and supports real-time updates and queries, but has a steep learning curve and is tough to administer.

Solution

  • For recommenders, Mahout has a non-distributed recommender implementation that does not use Hadoop. In fact, this is the only part that is real-time; the Hadoop-based parts are not.

    I think it has only a small learning curve; see here and here for a pretty complete writeup.

    Mahout in Action chapters 2-5 cover this quite well too.
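To give a feel for what a non-distributed, in-memory recommender does, here is a rough Python sketch of a user-based approach: score how similar users are, pick the nearest neighbours, and estimate ratings for items the target user has not seen. This is not Mahout's API or code; the function names, the choice of cosine similarity, and the sample ratings are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two {item: rating} dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def nearest_users(ratings, user, n):
    """The n users most similar to `user`, as (similarity, name) pairs."""
    scored = [(cosine(ratings[user], prefs), other)
              for other, prefs in ratings.items() if other != user]
    scored = [(s, o) for s, o in scored if s > 0]
    scored.sort(key=lambda so: (-so[0], so[1]))
    return scored[:n]


def recommend(ratings, user, how_many=2, n_neighbors=2):
    """Estimate ratings for unseen items as a similarity-weighted average."""
    totals, weights = {}, {}
    for sim, other in nearest_users(ratings, user, n_neighbors):
        for item, rating in ratings[other].items():
            if item in ratings[user]:
                continue  # skip items the user already rated
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    estimates = {item: totals[item] / weights[item] for item in totals}
    ranked = sorted(estimates.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:how_many]


# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "u1": {"A": 5, "B": 3},
    "u2": {"A": 5, "B": 3, "C": 4},
    "u3": {"A": 1, "B": 5, "D": 5},
    "u4": {"C": 2, "D": 4},
}
recommendations = recommend(ratings, "u1")
print([item for item, _ in recommendations])  # ['D', 'C']
```

Because everything lives in memory, adding a rating to `ratings` is visible to the very next `recommend` call, which is the real-time property the question asks for.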