I am trying to plot a ROC curve. I computed the ROC AUC scores as shown below:
# bagged decision trees on an imbalanced classification problem
from numpy import mean
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.ensemble import BaggingClassifier
# generate dataset
X, y = make_classification(n_samples=10000, n_features=2, n_redundant=0,
n_clusters_per_class=1, weights=[0.99], flip_y=0, random_state=4)
# define model
model = BaggingClassifier()
# define evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate model
scores = cross_val_score(model, X, y, scoring='roc_auc', cv=cv, n_jobs=-1)
# summarize performance
print('Mean ROC AUC: %.3f' % mean(scores))
But when I call cross_val_predict (to get the probabilities for plotting the ROC curve) as below:
from sklearn.model_selection import cross_val_predict
scores2 = cross_val_predict(model, X, y, cv=cv, method='predict_proba')
I get the following error:
ValueError: cross_val_predict only works for partitions
How can I modify the code to plot the curve?
I found a somewhat similar problem in another Stack Overflow question, but it has not been answered yet.
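Use cross_val_score(RFmodel, X, y, scoring='accuracy', cv=cv, n_jobs=-1, error_score='raise')
if you need per-fold scores with a repeated cv. cross_val_predict does not return scores; it only returns the y_predicted
values, one per sample, so it needs a cv in which each sample lands in exactly one test fold.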
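For the ROC curve itself, the ValueError comes from the cv object: RepeatedStratifiedKFold with n_repeats=3 puts every sample into three test folds, while cross_val_predict requires the test folds to form a partition of the data. Below is a minimal sketch of one way around it, assuming a single non-repeated StratifiedKFold is acceptable for the plot. The names cv_single and proba are illustrative, and random_state on BaggingClassifier is added here only for reproducibility; neither is in the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# same dataset as in the question
X, y = make_classification(n_samples=10000, n_features=2, n_redundant=0,
                           n_clusters_per_class=1, weights=[0.99], flip_y=0,
                           random_state=4)

# random_state is an assumption added for reproducible output
model = BaggingClassifier(random_state=1)

# non-repeated CV: each sample appears in exactly one test fold
cv_single = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)

# out-of-fold class probabilities, shape (n_samples, 2)
proba = cross_val_predict(model, X, y, cv=cv_single, method='predict_proba')

# column 1 holds the probability of the positive class
fpr, tpr, thresholds = roc_curve(y, proba[:, 1])
print('Pooled out-of-fold ROC AUC: %.3f' % roc_auc_score(y, proba[:, 1]))
```

To draw the curve, pass fpr and tpr to a plotting call such as matplotlib's plt.plot(fpr, tpr). Note that this AUC is computed on the pooled out-of-fold predictions, so it will generally differ a little from the mean of the per-fold scores returned by cross_val_score.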