Tags: python, cross-validation, xgboost, k-fold

Python - cross_val_score on regression problem is not working - 'ValueError: continuous is not supported'


I am trying to use KFold on an XGBoost regression problem. A sample of the data is this:

[image: sample of the data]

When I use the following code:

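import pandas as pd
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold, cross_val_score
from xgboost import XGBClassifier

df = pd.read_csv('../data/df_samp.csv').head(1000)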
cat_columns = ['primary_use','meter','hour','weekday','month','wind_compass']
df_processed = pd.get_dummies(df, prefix_sep="_", columns=cat_columns)
X=df_processed.drop(['meter_reading','outlier_ratio','meter_reading_roll_avg','timestamp'],axis=1)
y=df_processed['meter_reading']

scores = []
model = XGBClassifier()
cv = KFold(n_splits=10, shuffle=False)

for train_index, test_index in cv.split(X):
    print("Train Index: ", train_index, "\n")
    print("Test Index: ", test_index)
    X_train, X_test, y_train, y_test = X.values[train_index], X.values[test_index], y.values[train_index], y.values[test_index]
    model.fit(X_train,y_train)
    y_pred=model.predict(X_test)
    predictions = [round(value) for value in y_pred]
    scores.append(r2_score(y_test,predictions))

I get the output

print(scores)
[0.406908684278529, 0.3320925821156784, 0.1039843686445262, 0.395466094618815, 0.13412072574647682, -0.015579242639622182, -0.17008382837529967, 0.3931056789610018, 0.4491969042604125, 0.49641651402527265]

When I try

scores = []
model = XGBClassifier()
cv = KFold(n_splits=10, random_state=42, shuffle=False)
cross_val_score(model, X.values, y.values, cv=10)

I get

ValueError: continuous is not supported

Does anybody know why?

Thank you


Solution

  • Thank you MrSoLoDolo for your suggestion.

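    I needed to use XGBRegressor() instead of XGBClassifier(). The target meter_reading is continuous, and cross_val_score scores a classifier with accuracy by default, which is not defined for continuous targets; that is most likely where the error comes from.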
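As a rough illustration of the fix, here is a minimal sketch on synthetic data. The data, the fold count, and the n_estimators value are assumptions made for the example, not values from the question. The fallback to GradientBoostingRegressor is also an assumption, added only so the sketch still runs where xgboost is not installed. It first shows that a classification metric rejects a continuous target, then runs cross_val_score with a regressor and R^2 scoring.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold, cross_val_score

try:
    from xgboost import XGBRegressor as Regressor
except ImportError:
    # Assumption for portability: use a scikit-learn regressor if xgboost is missing.
    from sklearn.ensemble import GradientBoostingRegressor as Regressor

# Synthetic stand-in for X and y (the real data comes from df_samp.csv).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Accuracy is a classification metric, so a continuous target is rejected.
error_message = None
try:
    accuracy_score(y, y + 0.01)
except ValueError as exc:
    error_message = str(exc)
print("accuracy_score raised:", error_message)

# With a regressor, cross_val_score can use a regression metric such as R^2.
model = Regressor(n_estimators=50, random_state=42)
cv = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
print("R^2 per fold:", scores)
print("Mean R^2:", scores.mean())
```

Passing the KFold object as cv (rather than cv=10) makes sure the splitter that was configured is the one actually used. Note that random_state only has an effect when shuffle=True.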