python machine-learning scikit-learn linear-regression

precision_score and accuracy_score raise a ValueError


I'm new to machine learning and am using the Boston housing dataset for predictions. Everything works except the calls to precision_score and accuracy_score. This is what I have done:

import pandas as pd 
import sklearn 
from sklearn.linear_model import LinearRegression
from sklearn import preprocessing,cross_validation, svm
from sklearn.datasets import load_boston
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, classification_report, confusion_matrix

boston = load_boston()
df = pd.DataFrame(boston.data)
df.columns= boston.feature_names
df['Price']= boston.target

X = np.array(df.drop(['Price'],axis=1), dtype=np.float64)
X = preprocessing.scale(X)

y = np.array(df['Price'], dtype=np.float64)

print (len(X[:,6:7]),len(y))

X_train,X_test,y_train,y_test=cross_validation.train_test_split(X,y,test_size=0.30)

clf =LinearRegression()
clf.fit(X_train,y_train)
y_predict = clf.predict(X_test)

print(y_predict,len(y_predict))
print (accuracy_score(y_test, y_predict))
print(precision_score(y_test, y_predict,average = 'macro'))

Now I get the following error:

  File "LinearRegression.py", line 33, in <module>
    accuracy = accuracy_score(y_test, y_predict)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/metrics/classification.py", line 172, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/metrics/classification.py", line 89, in _check_targets
    raise ValueError("{0} is not supported".format(y_type))
ValueError: continuous is not supported

Solution

  • You are using a linear regression model, defined as

    clf = LinearRegression()
    

    which predicts continuous values, e.g. 1.2 or 1.3.

    accuracy_score(y_test, y_predict), on the other hand, expects discrete class labels: binary values such as 0 and 1 (false and true), or categorical values such as 1, 2, 3, 4, where the numbers act as categories.

    That is why you are getting the error.

    How to solve this?

    Since you are trying to predict Price on the Boston data, which is a continuous value, I recommend changing your evaluation metric from accuracy to a regression metric such as MSE (mean squared error) or RMSE (root mean squared error).

    Replace:

    print(accuracy_score(y_test, y_predict))
    

    with:

    from sklearn.metrics import mean_squared_error
    print(mean_squared_error(y_test, y_predict))
    

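    That resolves the error. Remove the precision_score call as well: it is also a classification metric and fails on continuous predictions for the same reason.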
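To make the diagnosis and the fix concrete, here is a minimal sketch that reproduces the error and then scores the model with regression metrics. It rests on some assumptions: the data is synthetic (generated with NumPy as a stand-in for the Boston dataset, which recent scikit-learn releases no longer ship), train_test_split is imported from sklearn.model_selection (the replacement for the old cross_validation module), and the variable names are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.utils.multiclass import type_of_target

# Synthetic regression data (an assumption; it stands in for the Boston dataset).
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_predict = model.predict(X_test)

# scikit-learn labels this target as "continuous"; classification metrics
# only accept discrete label types such as "binary" or "multiclass".
target_type = type_of_target(y_test)
print("target type:", target_type)

# accuracy_score rejects continuous targets with a ValueError.
try:
    accuracy_score(y_test, y_predict)
    accuracy_failed = False
except ValueError as exc:
    accuracy_failed = True
    print("accuracy_score failed:", exc)

# Regression metrics accept continuous values.
mse = mean_squared_error(y_test, y_predict)
rmse = np.sqrt(mse)  # same units as the target
r2 = r2_score(y_test, y_predict)
print("MSE:", mse)
print("RMSE:", rmse)
print("R^2:", r2)
```

RMSE is often easier to interpret than MSE because it is in the same units as the target (here, the price), while R^2 gives a unit-free measure of how much of the variance the model explains.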