
Understanding Precision@K, AP@K, MAP@K


I'm currently evaluating a recommender system based on implicit feedback. I've been a bit confused with regard to the evaluation metrics for ranking tasks. Specifically, I am looking to evaluate by both precision and recall.

Precision@k has the advantage of not requiring any estimate of the size of the set of relevant documents, but the disadvantages that it is the least stable of the commonly used evaluation measures and that it does not average well, since the total number of relevant documents for a query has a strong influence on precision at k.

I have noticed myself that it tends to be quite volatile and as such, I would like to average the results from multiple evaluation logs.

Say I run an evaluation function that returns the following array:

[Image: NumPy array containing precision@k scores for each user]

And now I have an array for all of the precision@3 scores across my dataset.

If I take the mean of this array, and then average that mean across, say, 20 different evaluation runs, is this equivalent to Mean Average Precision@K (MAP@K), or am I understanding this a little too literally?

I am writing a dissertation with an evaluation section so the accuracy of the definitions is quite important to me.


Solution

  • There are two averages involved, which makes the concepts somewhat obscure, but they are pretty straightforward (at least in the recsys context). Let me clarify them:

    P@K

    The proportion of the top-K recommendations of your system that are relevant.


    For example, to calculate P@3: take the top 3 recommendations for a given user and check how many of them are good ones. That number divided by 3 gives you the P@3.
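
    The P@K calculation above can be sketched in a few lines. This is only an illustration: the function name precision_at_k and the example item IDs are made up here, and "good ones" is assumed to mean membership in a held-out set of relevant items for the user.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    # Divide by k itself, matching "that number divided by 3" for P@3.
    return hits / k


# Hypothetical data for one user: a ranked list and a held-out relevant set.
recommended = ["a", "b", "c", "d"]
relevant = {"a", "c", "x"}

p_at_3 = precision_at_k(recommended, relevant, 3)  # 2 of the top 3 are relevant
```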

    AP@K

    The mean of P@i for i=1, ..., K.


    For example, to calculate AP@3: sum P@1, P@2 and P@3 and divide that value by 3.

    AP@K is typically calculated for one user.
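
    As a rough sketch of AP@K exactly as defined in this answer (the plain mean of P@1 through P@K for one user), assuming hypothetical helper names and example data:

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def average_precision_at_k(recommended, relevant, k):
    """Mean of P@i for i = 1..k, following the simplified definition above."""
    total = sum(precision_at_k(recommended, relevant, i) for i in range(1, k + 1))
    return total / k


# Hypothetical data for a single user.
recommended = ["a", "b", "c"]
relevant = {"a", "c"}

# P@1 = 1, P@2 = 1/2, P@3 = 2/3, so AP@3 = (1 + 1/2 + 2/3) / 3
ap_at_3 = average_precision_at_k(recommended, relevant, 3)
```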

    MAP@K

    The mean of the AP@K for all the users.


    For example, to calculate MAP@3: sum AP@3 for all the users and divide that value by the number of users.
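
    To tie this back to the question: the mean of a per-user precision@3 array is the mean P@3, not MAP@3. A small sketch under the definitions in this answer, with made-up users and items, shows the two numbers can differ even when every user has the same P@3:

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def average_precision_at_k(recommended, relevant, k):
    """Mean of P@i for i = 1..k (the simplified definition used in this answer)."""
    return sum(precision_at_k(recommended, relevant, i) for i in range(1, k + 1)) / k


def mean_average_precision_at_k(recs_by_user, relevant_by_user, k):
    """Mean of AP@k over all users."""
    scores = [
        average_precision_at_k(recs_by_user[u], relevant_by_user[u], k)
        for u in recs_by_user
    ]
    return sum(scores) / len(scores)


# Hypothetical data: both users have exactly one hit in their top 3,
# but u1's hit is at rank 1 and u2's hit is at rank 3.
recs_by_user = {"u1": ["a", "b", "c"], "u2": ["x", "y", "z"]}
relevant_by_user = {"u1": {"a"}, "u2": {"z"}}

map_at_3 = mean_average_precision_at_k(recs_by_user, relevant_by_user, 3)
mean_p_at_3 = sum(
    precision_at_k(recs_by_user[u], relevant_by_user[u], 3) for u in recs_by_user
) / len(recs_by_user)
```

    Here mean P@3 is 1/3 for both users, while MAP@3 rewards u1 for ranking its hit first, so the two averages are not interchangeable.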

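    If you are a programmer, you can check this code, which is the implementation of the functions apk and mapk of ml_metrics, a library maintained by the CTO of Kaggle. Note that apk is slightly stricter than the simplified AP@K definition given here: it only adds P@i at ranks i where the recommended item is relevant, and it divides by min(number of relevant items, K) rather than by K.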
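
    For comparison, many implementations (including, as far as I can tell, apk in ml_metrics) use a hit-weighted AP@K: P@i is only added at ranks where a relevant item appears, and the sum is divided by min(number of relevant items, K). The following is a minimal sketch of that variant, not a copy of the library code; the name apk_hits_only and the example data are made up for illustration.

```python
def apk_hits_only(recommended, relevant, k):
    """Hit-weighted AP@k: add P@i only at ranks i where a relevant item appears."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    score = 0.0
    hits = 0
    seen = set()
    for rank, item in enumerate(recommended[:k], start=1):
        # Count each relevant item once, even if it is recommended twice.
        if item in relevant and item not in seen:
            hits += 1
            score += hits / rank  # this is P@rank at a hit position
        seen.add(item)
    # Normalize by the best achievable number of hits within k.
    return score / min(len(relevant), k)


# Hypothetical data for a single user.
recommended = ["a", "b", "c"]
relevant = {"a", "c"}

# Hits at ranks 1 and 3: (1/1 + 2/3) / min(2, 3)
hit_weighted_ap_at_3 = apk_hits_only(recommended, relevant, 3)
```

    On the same data, the simplified mean of P@1, P@2 and P@3 gives a different value, so for a dissertation it is worth stating explicitly which definition you use.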

    Hope it helped!