I'm using Apache Mahout as a recommendation engine. It's great, but I'm running into an issue that I'm not sure how to fix, and may or may not even be fixable...
All the recommendation data is computed and stored in memory. When I restart my machine I lose all that data and have to re-compute it. Is there a way to save what is in memory and then put it back into memory when the machine boots back up? I realize that I may not be asking this question using the correct terminology, or even describing the mechanisms at work correctly, but in essence I just want to be able to restart my machine without losing all the data, since the computations take a long time to complete.
Any help pointing me in the right direction on how to solve this would be appreciated. I'm not specifically looking for a Mahout-specific solution, just some help understanding the general problem... I'm in new territory here.
Thanks,
Mark
For item-based recommendations, you can precompute the item-item similarities, persist them, and load them again during initialization. I don't know of an existing implementation of this in the Mahout distribution, but to quote a discussion from December 2011 on the mailing list:
A model for item-based collaborative filtering simply consists of the precomputed item similarities.
We currently support such a precomputation only as hadoop job, but it should be a matter of an hour to create a class that precalculates the item similarities sequentially using an ItemBasedRecommender.
You can either store these similarities in the database and load them via MySQLJDBCInMemoryItemSimilarity/SQL92JDBCInMemoryItemSimilarity or you can write them to a .csv file and load them via FileItemSimilarity.
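Along those lines, here is a minimal sketch of the .csv route: compute the pairwise item similarities once sequentially, write them in the itemID1,itemID2,value format that FileItemSimilarity reads, and on the next startup load that file instead of recomputing. The file names (ratings.csv, similarities.csv), the choice of LogLikelihoodSimilarity as the underlying metric, and the brute-force nested loop over all item pairs are my own assumptions for illustration, not something prescribed by Mahout.

import java.io.File;
import java.io.PrintWriter;

import org.apache.mahout.cf.taste.impl.common.LongPrimitiveIterator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.impl.similarity.file.FileItemSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class PrecomputeSimilarities {

  public static void main(String[] args) throws Exception {
    // ratings.csv / similarities.csv are placeholder file names for this sketch
    DataModel model = new FileDataModel(new File("ratings.csv"));

    // Phase 1 (run once, or whenever the preference data changes):
    // compute all pairwise item similarities sequentially and persist them to disk.
    ItemSimilarity similarity = new LogLikelihoodSimilarity(model);
    try (PrintWriter out = new PrintWriter(new File("similarities.csv"))) {
      long[] itemIDs = itemIDs(model);
      for (int i = 0; i < itemIDs.length; i++) {
        for (int j = i + 1; j < itemIDs.length; j++) {
          double value = similarity.itemSimilarity(itemIDs[i], itemIDs[j]);
          if (!Double.isNaN(value)) {
            // FileItemSimilarity expects lines of the form itemID1,itemID2,value
            out.println(itemIDs[i] + "," + itemIDs[j] + "," + value);
          }
        }
      }
    }

    // Phase 2 (at every startup): load the persisted similarities instead of recomputing.
    ItemSimilarity precomputed = new FileItemSimilarity(new File("similarities.csv"));
    GenericItemBasedRecommender recommender =
        new GenericItemBasedRecommender(model, precomputed);
    System.out.println(recommender.recommend(1L, 5));
  }

  // Collect all item IDs from the data model into an array for the pairwise loop.
  private static long[] itemIDs(DataModel model) throws Exception {
    long[] ids = new long[model.getNumItems()];
    LongPrimitiveIterator it = model.getItemIDs();
    for (int i = 0; it.hasNext(); i++) {
      ids[i] = it.nextLong();
    }
    return ids;
  }
}

Note that the all-pairs loop is O(n^2) in the number of items, which is exactly why the mailing list points at the Hadoop job for large catalogs; for a moderate number of items it is usually acceptable, and the point is that it only runs when the data changes rather than on every restart. The database route works the same way, just writing the triples into a table and loading them with MySQLJDBCInMemoryItemSimilarity or SQL92JDBCInMemoryItemSimilarity instead of FileItemSimilarity.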