First off, I am new to Stack Overflow, so if I can improve how I phrase my question or if I missed something obvious, please point it out!
I am building a convolutional classification network in Keras, where the network is asked to predict which parameter was used to generate the image. The classes are encoded as 5 float values, e.g. a list of the classes may look like this:
[[0.], [0.76666665], [0.5], [0.23333333], [1.]]
I want to one-hot encode these classes using the keras.utils.to_categorical(y, num_classes=5, dtype='float32') function.
However, it returns the following:
array(
[
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.]
],
dtype=float32)
It only takes integers as input, thus it maps all values < 1 to 0.
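A minimal sketch of what I assume is happening, reproduced with NumPy alone: the assumption (not confirmed from the Keras source here) is that to_categorical casts its input to an integer dtype, which would explain the output above.

```python
import numpy as np

# The labels from above, one float per sample.
y = np.array([[0.0], [0.76666665], [0.5], [0.23333333], [1.0]])

# Casting floats to an integer dtype truncates toward zero,
# so every value below 1 ends up in class 0.
as_int = y.astype("int64").ravel()
print(as_int)
```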
I could circumvent this by multiplying all values by a constant so that they are all integers, and I think there is also a way to solve this with scikit-learn. But that seems like a big workaround for a problem that should be trivial to solve within Keras alone, which makes me believe I am missing something obvious.
I hope somebody is able to point out a simple alternative using just Keras.
Because floating point values are continuous, it's not advisable to one-hot encode them directly. Instead, map each value to an integer index and encode the indices, something like this:
a = {}
classes = []
for item, i in zip(your_array, range(len(your_array))):
    a[str(i)] = item
    classes.append(str(i))
encoded_classes = to_categorical(classes)
The dictionary lets you map each class index back to its actual value later.
EDIT: Updated after comment from nuric.
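from keras.utils import to_categorical

your_array = [[0.], [0.76666665], [0.5], [0.23333333], [1.]]
class_values = {}  # class index (as a string) -> original float value
classes = []       # integer class indices, one per entry
for i, item in enumerate(your_array):
    class_values[str(i)] = item
    classes.append(i)
encoded_classes = to_categorical(classes)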
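If the label list has one entry per sample and the same value appears more than once, the loop above would give every entry its own class. A possible variant, sketched here with NumPy only, is to let np.unique assign one index per distinct value. The names labels, class_indices, one_hot and decoded are made up for this illustration, and np.eye stands in for the one-hot step; passing class_indices to to_categorical should give the same matrix.

```python
import numpy as np

# Labels as in the question: one float value per sample.
labels = np.array([[0.0], [0.76666665], [0.5], [0.23333333], [1.0]], dtype="float32")

# np.unique returns the sorted distinct values; with return_inverse=True it
# also returns, for each sample, the index of its value in that sorted array.
class_values, class_indices = np.unique(labels.ravel(), return_inverse=True)

# One-hot encode the integer indices by selecting rows of an identity matrix.
one_hot = np.eye(len(class_values), dtype="float32")[class_indices]

# Map one-hot (or softmax) rows back to the original float values.
decoded = class_values[one_hot.argmax(axis=1)]
```

This relies on exact float equality, so values that differ only by rounding noise would end up as separate classes; rounding the labels first (e.g. with np.round) may be needed in that case.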