python scikit-learn regression cross-validation k-fold

Metric for K-fold Cross Validation for Regression models


I wanted to run cross-validation on a regression (non-classification) model and ended up getting mean accuracies of about 0.90. However, I don't know which metric the method uses to compute those accuracies. I know how the splitting in k-fold cross-validation works; I just don't know the formula that scikit-learn uses to score the predictions. (I do know how it works for a classification model.) Can someone give me the metric/formula used by sklearn.model_selection.cross_val_score?

Thanks in advance.

from sklearn.model_selection import cross_val_score

def metrics_of_accuracy(classifier, X_train, y_train):
    # One score per fold (10 folds); with no scoring argument, the estimator's own score method is used
    accuracies = cross_val_score(estimator=classifier, X=X_train, y=y_train, cv=10)
    print(accuracies.mean(), accuracies.std())
    return accuracies

Solution

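  • By default (when no scoring argument is passed), cross_val_score calls the estimator's own score method. For scikit-learn classifiers that is accuracy; for regressors it is the coefficient of determination R^2, the same value that sklearn.metrics.r2_score returns. So the 0.90 you are seeing is a mean R^2, not an accuracy. The formula is R^2 = 1 - SSE(y_hat) / SSE(y_mean), where SSE(y_hat) is the sum of squared errors of the model's predictions and SSE(y_mean) is the sum of squared errors you would get by always predicting the mean of the actual target values. Each fold's score is computed on that fold's held-out data.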
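To make the formula concrete, here is a minimal pure-Python sketch of R^2. The function name r2 and the sample numbers are made up for illustration and are not part of scikit-learn; the function only mirrors the formula above.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SSE(y_hat) / SSE(y_mean)."""
    mean_y = sum(y_true) / len(y_true)
    # Squared error of the model's predictions
    sse_pred = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of a baseline that always predicts the mean target
    sse_mean = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - sse_pred / sse_mean


# Illustrative data (hypothetical values, not from the question)
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

score = r2(y_true, y_pred)
print(round(score, 4))  # 0.9486

# Always predicting the mean gives exactly 0
baseline = r2(y_true, [sum(y_true) / len(y_true)] * len(y_true))
print(baseline)  # 0.0
```

A perfect model scores 1, a model no better than predicting the mean scores 0, and a model worse than that scores below 0, so a mean of about 0.90 across folds indicates a good fit. Passing the same lists to sklearn.metrics.r2_score should give the same value.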