python scikit-learn svm mlp

Precision, recall, F1 score all have zero value for the minority class in the classification report


I got a warning while using the SVM and MLP classifiers from the scikit-learn package:

C:\Users\cse_s\anaconda3\lib\site-packages\sklearn\metrics\_classification.py:1327: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use zero_division parameter to control this behavior.
  _warn_prf(average, modifier, msg_start, len(result))

Code for splitting dataset

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y)

Code for SVM classifier

from sklearn import svm
from sklearn.metrics import classification_report
SVM_classifier = svm.SVC(kernel="rbf", probability = True, random_state=1)
SVM_classifier.fit(X_train, y_train)
SVM_y_pred = SVM_classifier.predict(X_test)
print(classification_report(y_test, SVM_y_pred))

Code for MLP classifier

from sklearn.neural_network import MLPClassifier
MLP = MLPClassifier(random_state=1, learning_rate = "constant", learning_rate_init=0.3, momentum = 0.2 )
MLP.fit(X_train, y_train)
R_y_pred = MLP.predict(X_test)
target_names = ['No class', 'Yes Class']
print(classification_report(y_test, R_y_pred, target_names=target_names))

The warning is the same for both classifiers.


Solution

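  • The warning means the classifier never predicted the minority class, so precision for that label is 0/0. The zero_division parameter of classification_report sets the value to return when such a zero division occurs in the precision or recall formula. You can pass 0 or 1: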

    classification_report(y_test, R_y_pred, target_names=target_names, zero_division=0)
    

    I don't know what your data looks like, so here is an example using the breast cancer dataset.

    Features of the cancer dataset:

    import pandas as pd
    import numpy as np
    from sklearn import svm
    from sklearn.model_selection import train_test_split
    from sklearn.datasets import load_breast_cancer
    from sklearn.neural_network import MLPClassifier
    from sklearn.metrics import classification_report
    cancer = load_breast_cancer()
    df_feat = pd.DataFrame(cancer['data'],columns=cancer['feature_names'])
    df_feat.head()
    

    Target of dataset:

    df_target = pd.DataFrame(cancer['target'],columns=['Cancer'])
    np.ravel(df_target) # convert it into a 1-d array
    

    Generate classification report:

    X_train, X_test, y_train, y_test = train_test_split(df_feat, np.ravel(df_target), test_size=0.3, random_state=101)
    SVM_classifier = svm.SVC(kernel="rbf", probability = True, random_state=1)
    SVM_classifier.fit(X_train, y_train)
    SVM_y_pred = SVM_classifier.predict(X_test)
    print(classification_report(y_test, SVM_y_pred))
    

    Generate classification report for MLP Classifier:

    MLP = MLPClassifier(random_state=1, learning_rate = "constant", learning_rate_init=0.3, momentum = 0.2 )
    MLP.fit(X_train, y_train)
    R_y_pred = MLP.predict(X_test)
    target_names = ['No class', 'Yes Class']
    print(classification_report(y_test, R_y_pred, target_names=target_names, zero_division=0))
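
To see where the warning comes from, here is a minimal sketch using made-up labels (y_true and y_pred below are hypothetical, not the asker's data). It assumes a classifier that has collapsed to always predicting the majority class, which is the situation the warning describes. Precision for the minority label is then 0/0, so zero_division decides what gets reported, while recall is a genuine 0.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical toy labels: 8 majority-class samples (0) and 2 minority-class samples (1).
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
# Assumed behaviour: the classifier always predicts the majority class.
y_pred = np.zeros_like(y_true)

# Count how often each label is predicted; label 1 never appears.
labels, counts = np.unique(y_pred, return_counts=True)
predicted_counts = dict(zip(labels.tolist(), counts.tolist()))

# Precision for label 1 is TP / (TP + FP) = 0 / 0, so zero_division picks the value.
precision_as_0 = precision_score(y_true, y_pred, pos_label=1, zero_division=0)
precision_as_1 = precision_score(y_true, y_pred, pos_label=1, zero_division=1)

# Recall for label 1 is TP / (TP + FN) = 0 / 2, a real zero that zero_division does not change.
recall_minority = recall_score(y_true, y_pred, pos_label=1, zero_division=0)

print(predicted_counts)
print(precision_as_0, precision_as_1, recall_minority)
```

Note that zero_division only changes the reported number and silences the warning. It does not make the model predict the minority class, so running the same np.unique check on SVM_y_pred or R_y_pred is a quick way to confirm whether that is what is happening.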