python machine-learning scikit-learn classification

Classification Report - Precision and F-score are ill-defined

I imported classification_report from sklearn.metrics and when I enter my np.arrays as parameters I get the following error :

/usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. 'precision', 'predicted', average, warn_for) /usr/local/lib/python3.6/dist-packages/sklearn/metrics/classification.py:1137: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. 'recall', 'true', average, warn_for)

Here is the code :

svclassifier_polynomial = SVC(kernel = 'poly', degree = 7, C = 5)

svclassifier_polynomial.fit(X_train, y_train)
y_pred = svclassifier_polynomial.predict(X_test)


poly = classification_report(y_test, y_pred)

When I was not using np.array in the past it worked just fine, any ideas on how i can correct this ?

Solution

This is not an error, just a warning that not all your labels are included in your y_pred, i.e. there are some labels in your y_test that your classifier never predicts.

Here is a simple reproducible example:

from sklearn.metrics import precision_score, f1_score, classification_report

y_true = [0, 1, 2, 0, 1, 2] # 3-class problem
y_pred = [0, 0, 1, 0, 0, 1] # we never predict '2'

precision_score(y_true, y_pred, average='macro') 
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. 
  'precision', 'predicted', average, warn_for)
0.16666666666666666

precision_score(y_true, y_pred, average='micro') # no warning
0.3333333333333333

precision_score(y_true, y_pred, average=None) 
[...] UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. 
  'precision', 'predicted', average, warn_for)
array([0.5, 0. , 0. ])

Exact same warnings are produced for f1_score (not shown).

Practically this only warns you that in the classification_report, the respective values for labels with no predicted samples (here 2) will be set to 0:

print(classification_report(y_true, y_pred))


              precision    recall  f1-score   support

           0       0.50      1.00      0.67         2
           1       0.00      0.00      0.00         2
           2       0.00      0.00      0.00         2

   micro avg       0.33      0.33      0.33         6
   macro avg       0.17      0.33      0.22         6
weighted avg       0.17      0.33      0.22         6

[...] UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. 
  'precision', 'predicted', average, warn_for)

When I was not using np.array in the past it worked just fine

Highly doubtful, since in the example above I have used simple Python lists, and not Numpy arrays...