python machine-learning xgboost

ValueError: continuous is not supported with xgboost classification


This is my error

ValueError                                Traceback (most recent call last)
<ipython-input-5-7c13d55b8367> in <module>()
      1 from sklearn.metrics import confusion_matrix, accuracy_score
      2 y_pred = classifier.predict(X_test)
----> 3 cm = confusion_matrix(y_test, y_pred)
      4 print(cm)
      5 accuracy_score(y_test, y_pred)

2nd Frame

/usr/local/lib/python3.7/dist-packages/sklearn/metrics/_classification.py in _check_targets(y_true, y_pred)
     95     # No metrics support "multiclass-multioutput" format
     96     if (y_type not in ["binary", "multiclass", "multilabel-indicator"]):
---> 97         raise ValueError("{0} is not supported".format(y_type))
     98 
     99     if y_type in ["binary", "multiclass"]:

ValueError: continuous is not supported

This is my code

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

dataset = pd.read_csv('NBA_proj_14.csv')
X = dataset.iloc[:, :-13].values
y = dataset.iloc[:, -13].values

# Splitting the dataset into the Training set and Test set

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

# Training XGBoost on the Training set

from xgboost import XGBClassifier
classifier = XGBClassifier()
classifier.fit(X_train, y_train)

# Making the Confusion Matrix

from sklearn.metrics import confusion_matrix, accuracy_score
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

Here is my dataset

[image of the dataset]


Solution

  • Here:

    X = dataset.iloc[:, :-13].values
    y = dataset.iloc[:, -13].values
    

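    These slices are column-wise: X takes every column except the last 13, and y takes the single column that is 13th from the end. Judging by the error, that column holds continuous values rather than class labels, which is not what you want for a classifier.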

    Only you know which column holds the class you want to predict; that column should be your target array. As the error hints, classification (and a confusion matrix in particular) needs discrete class labels, not a continuous variable.
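
One way to check what kind of target you have is `type_of_target` from `sklearn.utils.multiclass`, which reports how scikit-learn categorises a label array. The sketch below is only an illustration: the arrays, the `threshold` value and the predictions are made up and are not taken from the NBA dataset in the question. It shows a continuous array being reported as such, and one possible fix (binning it into two classes), assuming a cut-off makes sense for your problem.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-in for a continuous column (e.g. a per-game average).
y_continuous = np.array([3.2, 11.5, 7.8, 20.1, 5.4, 15.9])

# Non-integer floats are reported as a continuous target,
# which the classification metrics reject.
print(type_of_target(y_continuous))  # continuous

# One option, assuming a threshold is meaningful for your problem:
# convert the continuous value into a binary class label.
threshold = 10.0  # hypothetical cut-off
y_class = (y_continuous >= threshold).astype(int)
print(type_of_target(y_class))  # binary

# With discrete labels on both sides, confusion_matrix works.
y_pred = np.array([0, 1, 1, 1, 0, 1])  # made-up predictions for illustration
cm = confusion_matrix(y_class, y_pred)  # rows: true class, columns: predicted class
print(cm)
```

If the quantity is genuinely continuous and binning it makes no sense, the other route is to treat the problem as regression (for example with `XGBRegressor`) and evaluate it with regression metrics instead of a confusion matrix.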