I'm training an LSTM next-character/word predictor in Keras, and I want to include it in an iOS project. When I convert it to CoreML, the output shape and values don't match my original Keras model.
To summarize my question:
The model I train has the following layout:
model = Sequential()
model.add(LSTM(128, input_shape=(SEQUENCE_LENGTH, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
Where a sequence is a list of characters of length 40 (SEQUENCE_LENGTH) and chars is a list of possible characters, in this case 31. So the output shape of the model is (None, 31).
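For context, the model's input is one-hot encoded. A minimal sketch of how a 40-character sequence becomes a (40, 31) matrix — the chars vocabulary here is a hypothetical stand-in with 31 symbols, not the one from my dataset:

```python
import numpy as np

# Hypothetical vocabulary of 31 symbols (26 letters + 5 punctuation marks).
chars = sorted("abcdefghijklmnopqrstuvwxyz .,!?")
char_to_index = {c: i for i, c in enumerate(chars)}
SEQUENCE_LENGTH = 40

def one_hot_encode(sequence):
    """Turn a string of SEQUENCE_LENGTH characters into a (40, 31) matrix."""
    assert len(sequence) == SEQUENCE_LENGTH
    x = np.zeros((SEQUENCE_LENGTH, len(chars)))
    for t, c in enumerate(sequence):
        x[t, char_to_index[c]] = 1.0
    return x

# A 40-character example sequence.
x = one_hot_encode("the quick brown fox jumped over the lazy")
print(x.shape)  # (40, 31)
```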
If I try to convert the model using
coreml_model = coremltools.converters.keras.convert(
    'keras_model.h5',
    input_names=['sentence'],
    output_names=['chars'],
    class_labels=chars)
I get the following error:
NSLocalizedDescription = "The size of the output layer 'characters' in the neural network does not match the number of classes in the classifier.";
I guess this makes sense, since the output shape has a None-dimension.
If I don't supply the class_labels argument, the model converts just fine. However, when I run result = coreml_model.predict(), I get an output matrix of shape (40, 31) instead of a single list of 31 character probabilities.
None of the entries in the result matches the values from the Keras model. Only the first entry has unique values for each character; all later entries contain exactly the same values.
The CoreML model output layer has the following metadata:
output {
name: "characters"
shortDescription: "Next predicted character"
type {
multiArrayType {
shape: 31
dataType: DOUBLE
}
}
}
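For reference, the expected 31-element probability vector maps back to a character with an argmax over the vocabulary. A quick sketch, again using a hypothetical 31-symbol chars list:

```python
import numpy as np

# Hypothetical vocabulary of 31 symbols (assumption; use your own chars list).
chars = sorted("abcdefghijklmnopqrstuvwxyz .,!?")

def decode_prediction(probabilities):
    """Map the model's 31 output probabilities to the most likely character."""
    probabilities = np.asarray(probabilities)
    assert probabilities.shape == (len(chars),)
    return chars[int(np.argmax(probabilities))]

# Fake model output where index 5 carries the highest probability.
fake_output = np.full(31, 0.01)
fake_output[5] = 0.7
predicted_char = decode_prediction(fake_output)
```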
Thank you very much for helping!
The problem turned out to be CoreML's incompatibility with multi-dimensional inputs. I found this blog, which pointed me in the right direction.
To fix it, I had to flatten the input by adding a Reshape layer, and reshape each input training sample into a single vector. The new model looks like this:
# Input is now a single vector of length 1240 (SEQUENCE_LENGTH * len(chars) = 40 * 31)
input_shape = (SEQUENCE_LENGTH * len(chars))
model = Sequential()
# The reshape layer makes sure that I don't have to change anything inside the layers.
model.add(Reshape((SEQUENCE_LENGTH, len(chars)), input_shape=(input_shape,)))
model.add(LSTM(128, input_shape=(SEQUENCE_LENGTH, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))
All input vectors have to be reshaped the same way:
x = x.reshape(x.shape[0], input_shape)
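As a sanity check, flattening the (N, 40, 31) training tensor to (N, 1240) is lossless, since the Reshape layer inside the model undoes it exactly. A quick numpy sketch with fake data:

```python
import numpy as np

SEQUENCE_LENGTH = 40
num_chars = 31                              # len(chars) in the model above
input_shape = SEQUENCE_LENGTH * num_chars   # 1240

# Fake one-hot-style training data: 5 samples of shape (40, 31).
x_original = np.random.rand(5, SEQUENCE_LENGTH, num_chars)

# Flatten each sample into a single 1240-element vector for the CoreML-friendly model.
x = x_original.reshape(x_original.shape[0], input_shape)
print(x.shape)  # (5, 1240)

# The model's Reshape layer performs the inverse; verify the roundtrip is lossless.
x_back = x.reshape(x.shape[0], SEQUENCE_LENGTH, num_chars)
assert np.array_equal(x_back, x_original)
```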