python scikit-learn cross-validation

Get the best model after cross validation


How do I get the best model after training with k-fold cross-validation, without using grid search? For example:

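from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# best_params is a dict of hyperparameters chosen beforehand
model = XGBClassifier(**best_params)

cv_scores = cross_val_score(model, X_train, Y_train, cv=5, scoring='f1')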

I am not sure how to get the best model to predict Y_test from the X_test data. Thank you.


Solution

  • This isn't the purpose of cross_val_score.

    That is to say, cross_val_score doesn't produce multiple models for you to pick the best classifier from; it is a way of quantifying how well your model generalizes. It repeatedly trains on all but one fold of the data you provide (X_train and Y_train in this case) and tests on the remaining fold, to give a sense of whether your model overfits the training data or generalizes to data it was not trained on. If the values in the returned array are all close to one another, that is a good sign that the model's performance is stable and that it generalizes.

    Once you have made that determination, the next step is to train on all of the training data you have (X_train, Y_train) and then test against your held-out test data (X_test, Y_test) to determine the model's accuracy and precision.

    See this article to learn more about how you might use cross_val_score().
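
The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual setup: the synthetic dataset from make_classification and the use of GradientBoostingClassifier in place of XGBClassifier(**best_params) are assumptions made so the example runs with scikit-learn alone. Any scikit-learn-compatible classifier should slot in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic binary data standing in for the asker's dataset (assumption).
X, Y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, stratify=Y, random_state=0
)

# Stand-in for XGBClassifier(**best_params) (assumption).
model = GradientBoostingClassifier(random_state=0)

# Step 1: estimate generalization. cross_val_score fits clones of `model`
# on each fold, so `model` itself is still unfitted afterwards.
cv_scores = cross_val_score(model, X_train, Y_train, cv=5, scoring="f1")
print("CV F1: mean=%.3f, std=%.3f" % (cv_scores.mean(), cv_scores.std()))

# Step 2: fit once on the full training set.
model.fit(X_train, Y_train)

# Step 3: predict on the held-out test set and score it once.
Y_pred = model.predict(X_test)
test_f1 = f1_score(Y_test, Y_pred)
print("Test F1: %.3f" % test_f1)
```

If the per-fold fitted models are really wanted, cross_validate(..., return_estimator=True) returns them under the "estimator" key. Each of those was trained on only part of the training data, though, so refitting on the full training set as above is usually the better choice for final predictions.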