Tags: mahout, mahout-recommender

The output of Mahout spark-itemsimilarity and its indicators


The output of Mahout (0.11.1) spark-itemsimilarity looks like:
3705021559 3705021558:241.35418715327978 3705021546:163.6168323904276
As I understand it, the format is:
(item)tab(item1:score)tab(item2:score), where item1, item2, ..., itemX are the so-called indicators.
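
To make the format concrete, here is a minimal parsing sketch. It assumes fields are separated by whitespace (tab or space) and that `:` separates an indicator ID from its score; `parse_similarity_line` is a hypothetical helper, not part of Mahout.

```python
def parse_similarity_line(line):
    """Split one output row into (item_id, [(indicator_id, score), ...])."""
    fields = line.split()  # splits on tabs and spaces alike
    item_id, pairs = fields[0], fields[1:]
    indicators = []
    for pair in pairs:
        # Split on the last ':' so the ID part is kept whole.
        indicator_id, score = pair.rsplit(":", 1)
        indicators.append((indicator_id, float(score)))
    return item_id, indicators


row = "3705021559\t3705021558:241.35418715327978 3705021546:163.6168323904276"
item, indicators = parse_similarity_line(row)
print(item, indicators)
```

If the job was run with non-default delimiters, the two split calls would need to change accordingly.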

My question is how to use the indicators?

In some examples, such as
https://www.mapr.com/products/mapr-sandbox-hadoop/tutorials/recommender-tutorial and https://www.mapr.com/blog/mahout-spark-whats-new-recommenders%E2%80%94part-2,
the indicators are indexed, and recommendations are obtained by querying the indicator field. As I understand it, we take the list of items a person bought, use that list as the query against the indicator field in Elasticsearch/Solr, and the matching documents are the recommended (similar) items.
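
The indicator-field query can be sketched without a search engine. This is a simplified illustration under stated assumptions: each item is a document whose indicator field is a set of item IDs, and the score is a plain overlap count, whereas Elasticsearch/Solr would apply their own relevance weighting. The index contents and the function name are hypothetical.

```python
def recommend_by_indicator_field(index, history, k=3):
    """Rank items whose indicator field overlaps the user's history.

    index:   dict mapping item_id -> set of indicator item IDs
    history: item IDs the user has already interacted with
    """
    seen = set(history)
    scores = {}
    for item_id, indicators in index.items():
        if item_id in seen:
            continue  # do not recommend what the user already has
        overlap = len(indicators & seen)
        if overlap:
            scores[item_id] = overlap
    # Highest overlap first; ties broken by item ID for determinism.
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


# Hypothetical toy index, not real Mahout output.
index = {
    "ipad": {"iphone", "macbook"},
    "iphone": {"ipad", "macbook"},
    "nexus": {"pixel"},
    "case": {"iphone"},
}
recs = recommend_by_indicator_field(index, ["iphone", "macbook"])
print(recs)  # ['ipad', 'case']
```

Here the user's history is the query, and an item is returned because the history matches its indicators, not because its own ID was looked up.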

Why not simply do the reverse: given the list of items a person bought, query the ID field and take the indicators of the matching rows as the result? In other words, hasn't the output of spark-itemsimilarity already told us which items (indicators) are similar to each item?
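
For comparison, the ID-lookup idea asked about here could look like the sketch below. It assumes the similarity rows have been loaded into a dict keyed by item ID and that scores from several purchased items are simply summed; both the data and the summing rule are assumptions for illustration only.

```python
from collections import defaultdict


def recommend_by_id_lookup(similarity, history, k=3):
    """Look up each purchased item's row and add up its indicators' scores.

    similarity: dict mapping item_id -> list of (indicator_id, score)
    history:    item IDs the user has already bought
    """
    seen = set(history)
    totals = defaultdict(float)
    for item_id in history:
        for indicator_id, score in similarity.get(item_id, []):
            if indicator_id not in seen:
                totals[indicator_id] += score
    # Highest total first; ties broken by item ID for determinism.
    return sorted(totals, key=lambda i: (-totals[i], i))[:k]


# Hypothetical rows, not real Mahout output.
similarity = {
    "iphone": [("ipad", 200.0), ("case", 50.0)],
    "macbook": [("ipad", 120.0), ("dock", 80.0)],
}
recs = recommend_by_id_lookup(similarity, ["iphone", "macbook"])
print(recs)  # ['ipad', 'dock', 'case']
```

This needs one lookup per purchased item plus a merge step in application code, whereas the indicator-field query hands ranking to the search engine in a single request.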

Maybe I have misunderstood what indicators mean. Could anybody please clarify?


Solution

  • 3705021559 3705021558:241.35418715327978 3705021546:163.6168323904276 is exactly the format (item)tab(item1:score)tab(item2:score).

    The first item is the one being compared to all the rest. So this says that, compared with 3705021559, item 3705021558 has a log-likelihood ratio of 241.35418715327978, and so on.

    The output matches your input, so if 3705021558 is not an item ID, you may have mis-specified the location of the item ID in the input. Run spark-itemsimilarity with no params to get the help output. You can specify where in the input TSV your item-id is, where the user-id is, and where your indicator-name is.

    BTW, if you are planning to use this in a recommender, try out the Universal Recommender, which has event capture and a recommendation server all integrated: http://templates.prediction.io/PredictionIO/template-scala-parallel-universal-recommendation
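
To illustrate what it means to tell the job where the user-id, item-id, and indicator-name sit in a TSV, here is a small sketch. The column layout, the `purchase`/`view` names, and `extract_interactions` are all hypothetical; the actual option names are the ones printed by the spark-itemsimilarity help output.

```python
import csv
import io


def extract_interactions(tsv_text, user_col, item_col, action_col, action):
    """Return (user_id, item_id) pairs for rows whose action column matches."""
    pairs = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if row[action_col] == action:
            pairs.append((row[user_col], row[item_col]))
    return pairs


# Assumed layout: user-id, indicator-name, item-id.
log = "u1\tpurchase\tiphone\nu1\tview\tipad\nu2\tpurchase\tipad\n"
pairs = extract_interactions(log, user_col=0, item_col=2, action_col=1, action="purchase")
print(pairs)  # [('u1', 'iphone'), ('u2', 'ipad')]
```

Swapping `user_col` and `item_col` by mistake would put user IDs where item IDs are expected, which is one way unexpected IDs can end up in the output.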