Keras version 2.4.3
I'm creating a simple image-caption model that has two inputs and one output.
The model definition code is as follows:
import tensorflow as tf
from tensorflow.keras.layers import Input, Dropout, Dense, Embedding, LSTM, Add

# Image feature part
imginp = Input(shape=(512,))
imglay1 = Dropout(0.5)(imginp)
imglay2 = Dense(EMBED_SIZE, activation=act)(imglay1)
# LSTM part
textinp = Input(shape=(39,))
textlay1 = Embedding(VOCAB_SIZE, EMBED_SIZE, mask_zero=True)(textinp)
textlay2 = Dropout(0.5)(textlay1)
textlay3 = LSTM(EMBED_SIZE)(textlay2)
# Decoder part that combines both
declay1 = Add()([imglay2, textlay3])
declay2 = Dense(EMBED_SIZE, activation=act)(declay1)
output = Dense(VOCAB_SIZE, activation="softmax")(declay2)
# Creating the Keras model
model = tf.keras.models.Model(inputs=[imginp, textinp], outputs=output)
model.summary()
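For reference, here is a minimal self-contained sketch of the same architecture, with small hypothetical values for VOCAB_SIZE, EMBED_SIZE, and act (the original values are not all shown in the question), compiled and fit on random dummy data to check that gradients flow:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dropout, Dense, Embedding, LSTM, Add

# Hypothetical small sizes for a quick smoke test
VOCAB_SIZE, EMBED_SIZE, act = 100, 32, "relu"

# Image feature branch
imginp = Input(shape=(512,))
imglay2 = Dense(EMBED_SIZE, activation=act)(Dropout(0.5)(imginp))

# Text (LSTM) branch
textinp = Input(shape=(39,))
textlay = Embedding(VOCAB_SIZE, EMBED_SIZE, mask_zero=True)(textinp)
textlay3 = LSTM(EMBED_SIZE)(Dropout(0.5)(textlay))

# Decoder that merges both branches
declay2 = Dense(EMBED_SIZE, activation=act)(Add()([imglay2, textlay3]))
output = Dense(VOCAB_SIZE, activation="softmax")(declay2)

model = tf.keras.models.Model(inputs=[imginp, textinp], outputs=output)
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

# Dummy batch: image features, integer token sequences, next-word targets
img = np.random.rand(8, 512).astype("float32")
seq = np.random.randint(1, VOCAB_SIZE, size=(8, 39))
tgt = np.random.randint(0, VOCAB_SIZE, size=(8,))
model.fit([img, seq], tgt, epochs=1, verbose=0)
```

If this runs cleanly, the bracketed shapes shown for the InputLayers in model.summary() are just Keras's display convention and not themselves an error.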
The model, however, gives an error on model.fit(),
and I noticed the Input layers show a strange output shape, which I believe is causing the error. A snippet of the summary
looks like this:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_82 (InputLayer) [(None, 39)] 0
__________________________________________________________________________________________________
input_81 (InputLayer) [(None, 512)] 0
__________________________________________________________________________________________________
embedding_31 (Embedding) (None, 39, 300) 511800 input_82[0][0]
__________________________________________________________________________________________________
dropout_79 (Dropout) (None, 512) 0 input_81[0][0]
As you can see, the output shapes for the Input layers should be (None, 512)
and (None, 39),
but they appear to be lists. As a result, I'm getting a ValueError: no grad available for the variables,
even though I did test the Python data generator. I believe the Input
layer API is causing some strange error.
Any ideas?
I would like to point out two things that cause this discrepancy:
(1) Running the code locally vs. on Google Colab makes a difference: if this block of code is run on Colab, it throws an error at this point saying that gradients cannot propagate. So it seems to be an issue with Colab rather than the model.
(2) Another quirk in Colab is that if functions are written in .py files, imported into the Colab environment, and then called from the notebook as APIs, this error pops up. It is not related to the Keras version but to the underlying behavior of the Colab runtime, which I could not debug further.
I'm glad I was able to figure this out and test it. For more details, please comment and I will try my best to answer.