I have been puzzling over this one-hot encoding problem. I am sure it is a simple process, but I have been looking at it for a while and cannot see my mistake.
I have a set of train_labels of shape (1080, 1) with 6 integer classes. I am trying to turn this into a one-hot matrix using the following:
def convert_to_one_hot(train_labels_conv, classes):
    Y_train = np.eye(classes)[train_labels_conv.reshape(-1)].T
    return Y_train
Y_train = np.arange(6)
print(Y_train)
Y_train_hot = convert_to_one_hot(Y_train, len(Y_train))
print(Y_train_hot)
As a result I simply get
[0 1 2 3 4 5]
[[1. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 1.]]
Should I not have received the whole one-hot matrix for my training labels? I would appreciate any pointer in the right direction, as I am not yet comfortable with Python.
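The function itself is fine; the identity matrix appears because the test input is np.arange(6), i.e. each of the 6 classes exactly once, in order. With the real (1080, 1) labels you would get a (6, 1080) matrix. A quick sketch (the label values below are made up for illustration; only the function is from the question):

    import numpy as np

    def convert_to_one_hot(train_labels_conv, classes):
        # Each label indexes one row of the identity matrix; .T puts classes on rows
        return np.eye(classes)[train_labels_conv.reshape(-1)].T

    # Illustrative stand-in for the real (1080, 1) train_labels
    train_labels = np.array([[2], [0], [5], [1], [1], [3]])
    Y_train_hot = convert_to_one_hot(train_labels, 6)
    print(Y_train_hot.shape)  # (6, 6) here; (6, 1080) with the full label set

Column j of the result is the one-hot vector for sample j, e.g. column 0 has its 1 in row 2.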
If your labels are strings, you can use this function:
import numpy as np

target = np.array(['dog', 'dog', 'cat', 'cat', 'cat', 'dog', 'dog',
                   'cat', 'cat', 'hamster', 'hamster'])

def one_hot(array):
    unique, inverse = np.unique(array, return_inverse=True)
    onehot = np.eye(unique.shape[0])[inverse]
    return onehot

print(one_hot(target))
Output:
[[0. 1. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [1. 0. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [1. 0. 0.]
 [0. 0. 1.]
 [0. 0. 1.]]
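Because np.unique returns the classes in sorted order, you can also keep that array around to decode one-hot columns back into labels. A sketch (the decode step is my addition, not part of the function above):

    import numpy as np

    target = np.array(['dog', 'dog', 'cat', 'hamster'])

    def one_hot(array):
        unique, inverse = np.unique(array, return_inverse=True)
        onehot = np.eye(unique.shape[0])[inverse]
        return onehot, unique  # also return the class order for decoding

    encoded, classes = one_hot(target)
    print(classes)   # ['cat' 'dog' 'hamster'] -- sorted order
    decoded = classes[encoded.argmax(axis=1)]
    print(decoded)   # recovers the original labels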