Tags: python, scikit-learn, random-forest, cross-validation, supervised-learning

How to compute precision, recall and F1 score of an imbalanced dataset for K-fold cross-validation?


I have an imbalanced dataset for a binary classification problem. I have built a Random Forest classifier and used k-fold cross-validation with 10 folds.

kfold = model_selection.KFold(n_splits=10, random_state=42)
model = RandomForestClassifier(n_estimators=50)

I got the results for the 10 folds:

results = model_selection.cross_val_score(model, features, labels, cv=kfold)
print(results)
[ 0.60666667  0.60333333  0.52333333  0.73        0.75333333  0.72        0.7
  0.73        0.83666667  0.88666667]

I have calculated the overall accuracy by taking the mean and standard deviation of the results:

print("Accuracy: %.3f%% (%.3f%%)") % (results.mean()*100.0, results.std()*100.0)
Accuracy: 70.900% (10.345%)

I have computed my predictions as follows:

predictions = cross_val_predict(model, features, labels, cv=10)

Since this is an imbalanced dataset, I would like to calculate the precision, recall, and F1 score of each fold and average the results. How can I calculate these values in Python?


Solution

  • When you cross-validate, you can specify which scores to compute on each fold by passing a dictionary of scorers. Note that cross_val_score only accepts a single scorer, so use cross_validate for multiple metrics:

    from sklearn.metrics import make_scorer, accuracy_score, precision_score, recall_score, f1_score

    scoring = {'accuracy': make_scorer(accuracy_score),
               'precision': make_scorer(precision_score),
               'recall': make_scorer(recall_score),
               'f1_score': make_scorer(f1_score)}

    # shuffle must be enabled for random_state to take effect in recent scikit-learn versions
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=42)
    model = RandomForestClassifier(n_estimators=50)

    results = model_selection.cross_validate(estimator=model,
                                             X=features,
                                             y=labels,
                                             cv=kfold,
                                             scoring=scoring)
    

    After cross-validation, the returned results dictionary contains the keys 'test_accuracy', 'test_precision', 'test_recall' and 'test_f1_score' (plus 'fit_time' and 'score_time'), each storing an array with that metric's value on every fold. For each metric you can calculate the mean and standard deviation with np.mean(results['test_' + name]) and np.std(results['test_' + name]), where name is one of the metric names specified in scoring.
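
    For example, a minimal sketch (assuming the scoring dictionary and the results returned by the cross_validate call above) that prints the mean and standard deviation of every metric across the folds:

    # each results['test_<name>'] entry is a NumPy array with one score per fold
    for name in scoring:
        scores = results['test_' + name]
        print("%s: %.3f (%.3f)" % (name, scores.mean(), scores.std()))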