
Why does model.fit produce "is incompatible with the layer"?


This is my LSTM model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.optimizers import Adam

lstm_out1 = 150
embed_dim = 768

model = Sequential()
model.add(Embedding(embed_tensor.shape[0], embed_dim, weights=[embed_tensor], input_length=512, trainable=False))
model.add(LSTM(lstm_out1, dropout=0.2, recurrent_dropout=0.2))

model.add(Dense(64, activation='relu'))

model.add(Dense(1, activation='sigmoid'))

adam = Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)

model.compile(loss='binary_crossentropy',
                  optimizer=adam,
                  metrics=['accuracy'])

model.summary()

The model summary is:

Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 embedding_12 (Embedding)    (None, 512, 768)          91812096  
                                                                 
 lstm_10 (LSTM)              (None, 150)               551400    
                                                                 
 dense_16 (Dense)            (None, 64)                9664      
                                                                 
 dense_17 (Dense)            (None, 1)                 65        
                                                                 
=================================================================
Total params: 92,373,225
Trainable params: 561,129
Non-trainable params: 91,812,096
_________________________________________________________________
I called model.fit() like this:

model.fit(token_tensor, labels, batch_size=10, epochs=1, shuffle=True)

It produces this error:

  ValueError: Input 0 of layer "sequential_8" is incompatible with the layer: expected shape=(None, 512), found shape=(10, 1, 512)

Each of my input vectors has size 512; the 10 in the found shape is the batch size. Please help me resolve this.


Solution

  • As the error indicates, your input tensor shape of (10, 1, 512) is not compatible with the declared input of the model, which has shape (None, 512). The "None" in the input shape indicates that the size for that axis is arbitrary.

    To fix your input, remove the second dimension (the size-1 axis), after which it will be compatible with the model. Depending on the exact type of token_tensor, you could achieve this with any of the following (a runnable check follows these options):

    token_tensor = token_tensor[:, 0, :]
    

    or

    token_tensor = token_tensor.squeeze(axis=1)
    

    or

    token_tensor = token_tensor.reshape((-1, token_tensor.shape[-1]))
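
    As a quick sanity check, here is a minimal, self-contained sketch. It assumes token_tensor is a NumPy array (substitute the equivalent TensorFlow or PyTorch calls if it is a framework tensor) and uses a dummy array in place of your real data:

    import numpy as np

    # Dummy stand-in for token_tensor, with the problematic extra axis
    # reported in the error message: shape (10, 1, 512)
    token_tensor = np.random.randint(0, 1000, size=(10, 1, 512))

    # All three options remove the size-1 axis and give shape (10, 512):
    via_index   = token_tensor[:, 0, :]
    via_squeeze = token_tensor.squeeze(axis=1)
    via_reshape = token_tensor.reshape((-1, token_tensor.shape[-1]))

    print(via_index.shape, via_squeeze.shape, via_reshape.shape)
    # (10, 512) (10, 512) (10, 512)

    Note that passing axis=1 to squeeze is slightly safer than a bare squeeze(), which removes every size-1 axis and would therefore also collapse the batch axis if a batch ever contained a single sample.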