Tags: python, keras, coreml

Output of converted CoreML model different from Keras


I'm training an LSTM next-character/word predictor in Keras and want to include it in an iOS project. When I convert it to CoreML, the output shape and values don't match those of my original Keras model.

To summarize my question:

  • Why does my converted model have different output shape than the original model, and how can I make sure they match?
  • Why do I get different prediction values from the converted model?

The model I train has the following layout:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation

model = Sequential()
model.add(LSTM(128, input_shape=(SEQUENCE_LENGTH, len(chars))))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

where a sequence is a list of characters of length 40 (SEQUENCE_LENGTH) and chars is the list of possible characters, 31 in this case. The output shape of the model is therefore (None, 31).
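
For reference, my input is one-hot encoded per character, roughly like this (a minimal sketch; the placeholder alphabet and example sentence are just stand-ins for my actual data):

import numpy as np

SEQUENCE_LENGTH = 40
chars = list("abcdefghijklmnopqrstuvwxyz .,'\n")   # 31 possible characters (placeholder alphabet)
char_indices = {c: i for i, c in enumerate(chars)}

sentence = "the quick brown fox jumps over the lazy dog"[:SEQUENCE_LENGTH]
x = np.zeros((1, SEQUENCE_LENGTH, len(chars)))     # batch of one sample
for t, ch in enumerate(sentence):
    x[0, t, char_indices[ch]] = 1.0                # one-hot encode each position
# x.shape == (1, 40, 31); model.predict(x).shape == (1, 31)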

If I try to convert the model using

coreml_model = coremltools.converters.keras.convert(
               'keras_model.h5', 
               input_names=['sentence'], 
               output_names=['chars'], 
               class_labels = chars)

I get the following error:

NSLocalizedDescription = "The size of the output layer 'characters' in the neural network does not match the number of classes in the classifier.";

I guess this makes sense, since the output shape has a None-dimension.

If I don't supply the class_labels argument, it converts the model just fine. However, when running result = coreml_model.predict(), I now get an output matrix of shape (40, 31) instead of a single list of 31 character probabilities.
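
This is roughly how I call the converted model from Python (a sketch; x is the one-hot batch from above, and the dictionary keys follow the input name I passed to the converter and the output name shown in the model metadata below):

import numpy as np

# CoreML's Python predict() takes a dict keyed by the model's input name.
result = coreml_model.predict({'sentence': x[0]})   # x[0] is the (40, 31) one-hot sequence
probs = np.array(result['characters'])
print(probs.shape)   # (40, 31) here, instead of the expected (31,)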

None of the entries in the result matches the values from the Keras model. Only the first entry has unique values for each character; all later entries have exactly the same values.

The CoreML model output layer has the following metadata:

output {
  name: "characters"
  shortDescription: "Next predicted character"
  type {
    multiArrayType {
      shape: 31
      dataType: DOUBLE
    }
  }
}

Thank you very much for helping!


Solution

  • The error was caused by CoreML's incompatibility with multi-dimensional inputs. I found this blog post, which pointed me in the right direction.

    So to fix it, I had to flatten the input by adding a Reshape layer and reshape the input training data into a single vector. The new model looks like this:

    from keras.models import Sequential
    from keras.layers import Reshape, LSTM, Dense, Activation

    # Input is now a single flat vector of length 1240 (40 * 31)
    input_shape = SEQUENCE_LENGTH * len(chars)
    model = Sequential()
    # The Reshape layer restores (SEQUENCE_LENGTH, len(chars)), so nothing inside the later layers has to change.
    model.add(Reshape((SEQUENCE_LENGTH, len(chars)), input_shape=(input_shape,)))
    model.add(LSTM(128, input_shape=(SEQUENCE_LENGTH, len(chars))))
    model.add(Dense(len(chars)))
    model.add(Activation('softmax'))
    

    All input vectors have to be reshaped the same way:

    x = x.reshape(x.shape[0], input_shape)
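
    With the flattened input, the original converter call (including class_labels) now goes through, and the predictions match Keras. Roughly like this (a sketch; 'classLabel' is coremltools' default name for the predicted class, and 'chars' is the output name passed to the converter):

    import coremltools

    coreml_model = coremltools.converters.keras.convert(
        'keras_model.h5',
        input_names=['sentence'],
        output_names=['chars'],
        class_labels=chars)

    # One flattened one-hot sample of length 1240 (40 * 31)
    result = coreml_model.predict({'sentence': x[0]})
    print(result['classLabel'])   # most likely next character
    print(result['chars'])        # dict mapping each character to its probability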