Serializing a Keras model with an embedding layer


I've trained a model with pre-trained word embeddings like this:

embedding_matrix = np.zeros((vocab_size, 100))
for word, i in text_tokenizer.word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector

embedding_layer = Embedding(vocab_size,
                        100,
                        embeddings_initializer=Constant(embedding_matrix),
                        input_length=50,
                        trainable=False)

With the architecture looking like this:

sequence_input = Input(shape=(50,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
text_cnn = Conv1D(filters=5, kernel_size=5, padding='same', activation='relu')(embedded_sequences)
text_lstm = LSTM(500, return_sequences=True)(embedded_sequences)


char_in = Input(shape=(50, 18, ))
char_cnn = Conv1D(filters=5, kernel_size=5, padding='same', activation='relu')(char_in)
char_cnn = GaussianNoise(0.40)(char_cnn)
char_lstm = LSTM(500, return_sequences=True)(char_in)



merged = concatenate([char_lstm, text_lstm]) 

merged_d1 = Dense(800, activation='relu')(merged)
merged_d1 = Dropout(0.5)(merged_d1)

text_class = Dense(len(y_unique), activation='softmax')(merged_d1)
model = Model([sequence_input,char_in], text_class)

When I convert the model to JSON, I get this error:

ValueError: can only convert an array of size 1 to a Python scalar

Similarly, model.save() appears to save the model correctly, but loading the saved file fails with a TypeError: Expected Float32.

My question is: is there something I am missing when trying to serialize this model? Do I need some sort of Lambda layer or something of the sort?

Any help would be greatly appreciated!


Solution

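  • Passing the matrix through Constant(embedding_matrix) puts the whole array into the layer's config, which is most likely what the JSON conversion trips over. Use the weights argument of the Embedding layer to provide the initial weights instead, so the matrix is stored with the model's weights and not in its config.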

    embedding_layer = Embedding(vocab_size,
                                100,
                                weights=[embedding_matrix],
                                input_length=50,
                                trainable=False)
    

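    The embedding weights remain non-trainable after saving and loading the model. In the summary of the reloaded model, the 1,000,000 non-trainable parameters match the embedding layer:

    # Import path is an assumption; use tensorflow.keras.models if that is what your setup provides.
    from keras.models import load_model

    model.save('1.h5')        # save architecture and weights to a single HDF5 file
    m = load_model('1.h5')    # rebuild the model from that file
    m.summary()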
    
    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to
    ==================================================================================================
    input_3 (InputLayer)            (None, 50)           0
    __________________________________________________________________________________________________
    input_4 (InputLayer)            (None, 50, 18)       0
    __________________________________________________________________________________________________
    embedding_1 (Embedding)         (None, 50, 100)      1000000     input_3[0][0]
    __________________________________________________________________________________________________
    lstm_4 (LSTM)                   (None, 50, 500)      1038000     input_4[0][0]
    __________________________________________________________________________________________________
    lstm_3 (LSTM)                   (None, 50, 500)      1202000     embedding_1[0][0]
    __________________________________________________________________________________________________
    concatenate_2 (Concatenate)     (None, 50, 1000)     0           lstm_4[0][0]
                                                                     lstm_3[0][0]
    __________________________________________________________________________________________________
    dense_2 (Dense)                 (None, 50, 800)      800800      concatenate_2[0][0]
    __________________________________________________________________________________________________
    dropout_2 (Dropout)             (None, 50, 800)      0           dense_2[0][0]
    __________________________________________________________________________________________________
    dense_3 (Dense)                 (None, 50, 15)       12015       dropout_2[0][0]
    ==================================================================================================
    Total params: 4,052,815
    Trainable params: 3,052,815
    Non-trainable params: 1,000,000
    __________________________________________________________________________________________________
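
As for why the original version fails to serialize: the following is a minimal sketch of the likely mechanism, not Keras's actual code. It assumes the serializer falls back to calling .item() on any NumPy object it finds in a config, which only works for arrays holding a single element. The names layer_config, light_config and numpy_to_scalar are hypothetical and exist only for this illustration; the matrix is a toy stand-in for the real one.

```python
import json

import numpy as np

# Toy stand-in for the real (vocab_size, 100) embedding matrix.
embedding_matrix = np.zeros((4, 3))

# Hypothetical config, shaped like what an initializer that holds the
# full matrix would contribute to the model's architecture description.
layer_config = {
    "embeddings_initializer": {
        "class_name": "Constant",
        "config": {"value": embedding_matrix},
    }
}


def numpy_to_scalar(obj):
    """Assumed fallback: treat every NumPy object as a single scalar."""
    if isinstance(obj, (np.ndarray, np.generic)):
        return obj.item()  # raises ValueError unless the array has size 1
    raise TypeError(f"Not JSON serializable: {type(obj).__name__}")


try:
    json.dumps(layer_config, default=numpy_to_scalar)
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__, error)

# With no array in the config, the same kind of description serializes
# without trouble; the matrix would travel with the saved weights instead.
light_config = {
    "embeddings_initializer": {"class_name": "RandomUniform", "config": {}}
}
serialized = json.dumps(light_config, default=numpy_to_scalar)
print(serialized)
```

Under that assumption, any approach that keeps the matrix out of the config should avoid the error, which is consistent with the weights argument working here.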