I have a list of strings which serve as labels for my classification problem (image recognition with a Convolutional Neural Network). These labels consist of 5-8 characters (numbers from 0 to 9 and letters from A to Z). To train my neural network I would like to one hot encode the labels. I wrote a code to encode one label but I am still experiencing difficulties when trying to apply the code to a list.
Here is my code for one label which works fine:
from numpy import argmax
# define input string
data = '7C24698'
print(data)
# define universe of possible input values
characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ '
# define a mapping of chars to integers
char_to_int = dict((c, i) for i, c in enumerate(characters))
int_to_char = dict((i, c) for i, c in enumerate(characters))
# integer encode input data
integer_encoded = [char_to_int[char] for char in data]
print(integer_encoded)
# one hot encode
onehot_encoded = list()
for value in integer_encoded:
character = [0 for _ in range(len(characters))]
character[value] = 1
onehot_encoded.append(character)
print(onehot_encoded)
# invert encoding
inverted = int_to_char[argmax(onehot_encoded[0])]
print(inverted)
I now want to get the same output for list of labels and store the output in a new list:
list_of_labels = ['7C24698', 'NDK745']
encoded_labels = []
How can I do this?
you can make a function with your working code and then use the built-in function map
to apply for each element from your lists_of_labels
your one-hot encoding function:
from numpy import argmax
# define input string
def my_onehot_encoded(data):
# define universe of possible input values
characters = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ '
# define a mapping of chars to integers
char_to_int = dict((c, i) for i, c in enumerate(characters))
int_to_char = dict((i, c) for i, c in enumerate(characters))
# integer encode input data
integer_encoded = [char_to_int[char] for char in data]
# one hot encode
onehot_encoded = list()
for value in integer_encoded:
character = [0 for _ in range(len(characters))]
character[value] = 1
onehot_encoded.append(character)
return onehot_encoded
list_of_labels = ['7C24698', 'NDK745']
encoded_labels = list(map(my_onehot_encoded, list_of_labels))