Tags: python, scikit-learn, sklearn-pandas

Calculating accuracy scores of predicted continuous values


from sklearn.metrics import accuracy_score
accuracy_score(y_true, y_pred)

I believe this code returns the accuracy of our predictions. However, I am comparing predicted and actual values of a continuous variable, and I expect that most of them will not be exactly the same.

Should I fit the test set values and plot the predicted values to get the R-squared?

Can anyone please advise me on how to measure the accuracy of predictions in the case of continuous variables?


Solution

  • In machine learning, accuracy is defined for discrete values (classes). It is the fraction of correct predictions out of the total number of predictions made.

    So a prediction of 319 where the true value is 320 still counts as an incorrect prediction.
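
    To make the exact-match behaviour concrete, here is a minimal sketch. The three target values are made up for illustration, and integers are used so that scikit-learn treats them as class labels:

```python
from sklearn.metrics import accuracy_score

# Hypothetical integer targets; the first prediction is off by one.
y_true = [320, 150, 75]
y_pred = [319, 150, 75]

# accuracy_score only counts exact matches, so 319 vs 320 is simply "wrong".
exact_match_fraction = accuracy_score(y_true, y_pred)
print(exact_match_fraction)  # 2 of 3 predictions match exactly
```

    The near miss is penalised exactly as much as a wildly wrong prediction would be, which is why accuracy tells you little about a regression model.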

    So it is not advisable to calculate accuracy for continuous values. For such values, you want a measure of how close the predicted values are to the true values. Predicting continuous values is known as regression, and the R-squared value is commonly used to measure the performance of a regression model.

    You can use r2_score(y_true, y_pred) for your scenario.
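
    As a rough sketch of what that call computes, the example below uses made-up values for y_true and y_pred and compares r2_score against the textbook formula 1 - SS_res / SS_tot worked out by hand:

```python
from sklearn.metrics import r2_score

# Hypothetical continuous targets and predictions, for illustration only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

r2 = r2_score(y_true, y_pred)

# Same quantity by hand: 1 - (residual sum of squares / total sum of squares).
mean_true = sum(y_true) / len(y_true)
ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
r2_manual = 1 - ss_res / ss_tot

print(r2, r2_manual)  # both are approximately 0.949
```

    A value close to 1 means the predictions explain most of the variation in the true values. The score can be negative when the model does worse than always predicting the mean.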

    There are various other metrics for regression tasks (prediction of continuous variables), such as:

    • Mean squared error,
    • Mean absolute error,
    • Explained variance score, etc.

    You can find more information about the sklearn implementations of these metrics in the regression metrics section of the scikit-learn documentation.
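
    The metrics listed above can be computed in the same way as r2_score. The sketch below assumes the same made-up y_true and y_pred values as an illustration, not real data:

```python
from sklearn.metrics import (
    explained_variance_score,
    mean_absolute_error,
    mean_squared_error,
)

# Hypothetical continuous targets and predictions, for illustration only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mse = mean_squared_error(y_true, y_pred)        # average squared error
mae = mean_absolute_error(y_true, y_pred)       # average absolute error
evs = explained_variance_score(y_true, y_pred)  # 1 - Var(error) / Var(y_true)

print(mse, mae, evs)  # 0.375, 0.5, and roughly 0.957
```

    Mean squared error penalises large misses more heavily than mean absolute error, so which one to report depends on how costly large errors are in your application.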