Hi, there is a problem when running the code: it pops up an error and cannot generate the result.
Error: ValueError: Input 2 is incompatible with layer model_2: expected shape=(None, 50), found shape=(None, 51)
Is there any solution for this? Much obliged.
The full error from the part that triggers the bug is below:
ValueError: in user code:
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:1478 predict_function *
return step_function(self, iterator)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:1468 step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
/opt/conda/lib/python3.7/site-packages/tensorflow/python/distribute/tpu_strategy.py:540 run
return self.extended.tpu_run(fn, args, kwargs, options)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/distribute/tpu_strategy.py:1296 tpu_run
return func(args, kwargs)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/distribute/tpu_strategy.py:1364 tpu_function
xla_options=tpu.XLAOptions(use_spmd_for_xla_partitioning=False))
/opt/conda/lib/python3.7/site-packages/tensorflow/python/tpu/tpu.py:968 replicate
xla_options=xla_options)[1]
/opt/conda/lib/python3.7/site-packages/tensorflow/python/tpu/tpu.py:1439
It looks like during training you are dropping the last element of train_data['decoder_inputs_ids'] and train_data['decoder_attention_mask'], while during prediction you are not:
model.fit(x=[train_data['input_ids'],
             train_data['attention_mask'],
             train_data['decoder_inputs_ids'][:, :-1],       # last token dropped during training
             train_data['decoder_attention_mask'][:, :-1]],  # last token dropped during training
          ...)

pred = model.predict([input_ids, attention_mask, decoder_inputs_ids, decoder_attention_mask])  # full length here
That's why during inference the decoder inputs have shape (None, 51) instead of the expected (None, 50).
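You can see the mismatch with a quick shape check (a minimal sketch, assuming max_len_sum is 51 and using a dummy array in place of the real tokenized data):

import numpy as np

max_len_sum = 51  # assumed value; use whatever length you padded to
dummy_decoder_inputs = np.zeros((4, max_len_sum), dtype=np.int32)  # stand-in for train_data['decoder_inputs_ids']

print(dummy_decoder_inputs[:, :-1].shape)  # (4, 50) -- what the model saw during fit()
print(dummy_decoder_inputs.shape)          # (4, 51) -- what predict() currently receives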
You can pad decoder_inputs_ids and decoder_attention_mask to max_len_sum-1 (instead of max_len_sum) during prediction:
# Pad sequences to max_len_sum-1 (instead of the original max_len_sum).
decoder_inputs_ids = tf.keras.preprocessing.sequence.pad_sequences(
    [decoder_input_ids[:-1]], maxlen=max_len_sum - 1, padding='post', truncating='post')
decoder_attention_mask = tf.keras.preprocessing.sequence.pad_sequences(
    [decoder_attention_mask[:-1]], maxlen=max_len_sum - 1, padding='post', truncating='post')
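With that change, the decoder inputs passed to predict have shape (1, max_len_sum - 1), i.e. (None, 50), which matches what the model expects. A short usage sketch (assuming input_ids and attention_mask are already padded to the encoder length used at training time):

print(decoder_inputs_ids.shape)      # (1, 50)
print(decoder_attention_mask.shape)  # (1, 50)

pred = model.predict([input_ids, attention_mask,
                      decoder_inputs_ids, decoder_attention_mask])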