Hey there, I'm using LabelEncoder and OneHotEncoder in a sample machine learning project, but an error appears when the OneHotEncoder step runs. The error is: Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
The column I am encoding has only two values, Negative or Positive.
What does this error message mean, and how do I fix it?
# read the data set from a CSV file
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.read_csv('diab.csv')
feature=dataset.iloc[:,:-1].values
lablel=dataset.iloc[:,-1].values
# convert string data to binary
# transform the string data in the lablel column to 0/1
from sklearn.preprocessing import LabelEncoder,OneHotEncoder
lab=LabelEncoder()
lablel=lab.fit_transform(lablel)
onehotencoder=OneHotEncoder()
lablel=onehotencoder.fit_transform(lablel).toarray()
# split the data into training and test sets
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test=train_test_split(feature,lablel,test_size=0.30)
# fitting SVM to the training set
from sklearn.svm import SVC
classifier=SVC(kernel='linear',random_state=0)
classifier.fit(x_train,y_train)
y_pred=classifier.predict(x_test)
#making the confusion matrix
from sklearn.metrics import confusion_matrix
cm=confusion_matrix(y_test, y_pred)
from sklearn.neighbors import KNeighborsClassifier
my_classifier=KNeighborsClassifier()
my_classifier.fit(x_train,y_train)
prediction=my_classifier.predict(x_test)
print(prediction)
from sklearn.metrics import accuracy_score
print (accuracy_score(y_test,prediction))
plot=plt.plot((prediction), 'b', label='GreenDots')
plt.show()
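I suspect the issue is that you have two possible labels and are treating them as separate values. The error message itself is raised because OneHotEncoder expects a 2D array (samples by features), while LabelEncoder returns a 1D array. Reshaping would silence the error, but it would not solve the underlying problem: the output of an SVM is usually a single value, so your labels need to be a single value for each sample. Instead of mapping the labels to one-hot vectors, just use a single value of 1 when the label is positive and 0 when the label is negative. That is what LabelEncoder already produces, so the OneHotEncoder step can be dropped.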
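The suggestion above could be sketched roughly as follows. This is only a minimal illustration, not the asker's actual pipeline: `raw_labels` and `features` are made-up stand-ins for the contents of diab.csv, which is not shown in the question. It tries to show why the 1D output of LabelEncoder triggers the error in OneHotEncoder, what reshaping would produce, and how fitting the SVC on the label-encoded values alone avoids the problem.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.svm import SVC

# Hypothetical stand-in for the last column of diab.csv (assumption).
raw_labels = np.array(
    ["Negative", "Positive", "Positive", "Negative", "Positive", "Negative"]
)
# Hypothetical two-column feature matrix, invented for illustration.
features = np.array(
    [[1.0, 2.0], [6.0, 7.0], [7.0, 8.0], [2.0, 1.0], [8.0, 6.0], [1.5, 1.0]]
)

# LabelEncoder returns a 1D array of integers. Classes are sorted,
# so "Negative" becomes 0 and "Positive" becomes 1.
labels = LabelEncoder().fit_transform(raw_labels)

# OneHotEncoder expects 2D input of shape (n_samples, n_features).
# Passing the 1D array raises the ValueError from the question.
try:
    OneHotEncoder().fit_transform(labels)
    one_hot_error = None
except ValueError as exc:
    one_hot_error = str(exc)

# Reshaping to a single column makes the encoder run, but the result has
# two columns per sample, which is not a suitable target for SVC.
one_hot = OneHotEncoder().fit_transform(labels.reshape(-1, 1)).toarray()

# Fit the classifier on the 1D label-encoded target instead.
classifier = SVC(kernel="linear", random_state=0)
classifier.fit(features, labels)
predictions = classifier.predict(features)

print(labels.shape, one_hot.shape, predictions.shape)
```

With the question's code, this would amount to removing the two OneHotEncoder lines and passing the output of `lab.fit_transform(lablel)` straight to `train_test_split`.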