performance machine-learning classification kaggle

What is 'mean' f1 score in machine learning?

I know the f1 score, which is computed from precision and recall. But what does 'mean' refer to in mean f1 score? When do we use it, and how do we calculate the 'mean'?

EDIT to explain my problem more explicitly: I know the f1 score is the harmonic mean of precision and recall. And to calculate an f1 score, multiple classification results are needed to calculate precision and recall.

For example, if we have a data set consisting of 1000 instances, we get 1000 classification results. We then put them into a contingency table, from which we can calculate the f1 score.

This is where I'm confused about the 'mean' f1 score. We calculate one f1 score from the contingency table, so what is there to average? All I can calculate is a single f1 score, so what does 'mean' refer to, and how do I calculate the 'mean' f1 score?

Solution

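The F1 score is a measure of the test accuracy of a binary classification task. In multi-label classification tasks, each document has its own F1 score, computed by comparing that document's predicted labels with its true labels. The mean F1 score is therefore the average of these per-document scores: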

$\frac{1}{N} \sum_{i=1}^{N} f_1(X_i)$

where $N$ is the number of rows in the data set being scored (here, the train set) and $f_1(X_i)$ is the F1 score of the $i$-th row.
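As a rough illustration, the calculation could be sketched in plain Python as below. The function names `f1_for_row` and `mean_f1` and the toy labels are made up for this example, and returning 0 for a row with no correctly predicted label is an assumed convention (a particular competition or library may define that case differently).

```python
def f1_for_row(true_labels, predicted_labels):
    """F1 score of one row: predicted label set vs. true label set."""
    true_set, pred_set = set(true_labels), set(predicted_labels)
    tp = len(true_set & pred_set)  # labels predicted correctly
    if tp == 0:
        # Assumed convention: no correct label (or empty sets) scores 0.
        return 0.0
    precision = tp / len(pred_set)
    recall = tp / len(true_set)
    return 2 * precision * recall / (precision + recall)


def mean_f1(all_true, all_pred):
    """Average of the per-row F1 scores, i.e. (1/N) * sum of f1(X_i)."""
    scores = [f1_for_row(t, p) for t, p in zip(all_true, all_pred)]
    return sum(scores) / len(scores)


# Toy data: three rows, each with a set of true and predicted labels.
y_true = [["a", "b"], ["c"], ["a", "d"]]
y_pred = [["a"], ["c"], ["b"]]

# Per-row F1 scores are 2/3, 1 and 0, so the mean is 5/9.
print(round(mean_f1(y_true, y_pred), 3))  # 0.556
```

The point of the sketch is that precision and recall are computed separately inside each row, so every row yields its own F1 score, and only then are the N scores averaged.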