python scikit-learn precision-recall

How do I use sklearn.metrics to compute micro/macro measures for a multilabel classification task?


I have results produced by a multilabel classifier for which I want to compute micro- and macro-precision, recall and F1 scores using sklearn.metrics in Python, but I can't quite figure out how.

I have two binary sparse matrices, dataOUT and dataGT that represent, respectively, classification results and ground truth for the same dataset. Both matrices are of size nLabels X nSamples. Each sample can be labelled with one or more labels, so dataOUT[i,j] is 1 if the classifier labeled jth sample with ith label, and 0 otherwise.

For any given class i, I can easily calculate regular precision, recall and F-score by extracting the ith rows from dataOUT and dataGT and feeding those to sklearn.metrics.precision_recall_fscore_support, e.g. something like this:

import numpy as np
from sklearn.metrics import precision_recall_fscore_support

iLabel = 5 # some specific label

yOUT = np.asarray(dataOUT[iLabel,:].todense()).reshape(-1)
yGT = np.asarray(dataGT[iLabel,:].todense()).reshape(-1)

ps,rs,fs,ss = precision_recall_fscore_support(yGT,yOUT)
p = ps[1]   # Precision for iLabel
r = rs[1]   # Recall for iLabel
f1 = fs[1]  # F1 for iLabel

But how do I calculate micro- and macro-measures for the entire dataset, i.e. how do I get a single micro-(P,R,F) triple and a single macro-(P,R,F) triple for the (dataOUT,dataGT) pair rather than for each label separately?

Thx!


Solution

  • Most metrics in scikit-learn support multilabel arguments, sklearn.metrics.precision_recall_fscore_support among them. If the documentation for a metric's arguments says:

    1d array-like, or label indicator array / sparse matrix

    you can pass the whole prediction matrix and the whole ground-truth matrix to the metric. These matrices must have shape [n_samples, n_labels]: every row corresponds to the set of labels for one sample, and every column to one label. Your matrices are nLabels X nSamples, so you should transpose them.

    # With the default average=None, each result is an array with one entry per label.
    ps,rs,fs,ss = precision_recall_fscore_support(dataGT.T, dataOUT.T)
    

    Also read the "Multiclass and multilabel classification" section of the scikit-learn documentation.
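
To get the single micro and macro triples the question asks for, one option is the average parameter of precision_recall_fscore_support. The following is a minimal sketch, not a definitive recipe: the toy matrices and the variable names y_true and y_pred are made up for illustration and only mimic the nLabels X nSamples layout described in the question.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.metrics import precision_recall_fscore_support

# Hypothetical toy data in the question's layout:
# rows are labels, columns are samples.
dataGT = csr_matrix(np.array([
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]))
dataOUT = csr_matrix(np.array([
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
]))

# scikit-learn expects shape (n_samples, n_labels), so transpose both.
y_true = dataGT.T.tocsr()
y_pred = dataOUT.T.tocsr()

# Micro averaging pools true positives, false positives and false
# negatives over all labels before computing the scores.
micro_p, micro_r, micro_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="micro"
)

# Macro averaging computes the scores per label and takes their
# unweighted mean.
macro_p, macro_r, macro_f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro"
)

# Sanity check of the micro scores from raw counts.
tp = y_true.multiply(y_pred).sum()   # predicted 1 and truly 1
manual_micro_p = tp / y_pred.sum()   # TP / (TP + FP)
manual_micro_r = tp / y_true.sum()   # TP / (TP + FN)

print("micro:", micro_p, micro_r, micro_f1)
print("macro:", macro_p, macro_r, macro_f1)
```

When average is set, the fourth return value (support) is None, which is why it is discarded here. If some label never occurs in the predictions or in the ground truth, scikit-learn emits a warning and substitutes a value for the undefined score; the zero_division parameter controls that behaviour.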