I am currently developing and testing an RNN that relies on a large amount of training data, so I have separated my training and testing code into different files. In one file I create, train, and save a tensorflow.keras
model to a file 'model.keras'.
I then load this model in another file and predict some values, but get the following error:
Failed to convert elements of {'class_name': '__tensor__', 'config': {'dtype': 'float64', 'value': [0.0, 0.0, 0.0, 0.0]}} to Tensor. Consider casting elements to a supported type. See https://www.tensorflow.org/api_docs/python/tf/dtypes for supported TF dtypes
By the way, I have tried running model.predict
with this exact same data in the file where I train the model, and it works smoothly. So the problem must be in loading the model, not in the data used to predict.
This mysterious float64
tensor is the value I passed into the masking layer. I don't understand why Keras is unable to recognize this JSON object as a Tensor and apply the masking operation accordingly. I have included snippets of my code below, edited for clarity and brevity:
model_generation.py:
import numpy as np
import tensorflow as tf

# Create model
model = tf.keras.Sequential([
    tf.keras.layers.Input((352, 4)),
    tf.keras.layers.Masking(mask_value=tf.convert_to_tensor(np.array([0.0, 0.0, 0.0, 0.0]))),
    tf.keras.layers.GRU(50, return_sequences=True, activation='tanh'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.GRU(50, activation='tanh'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(units=1, activation='sigmoid'),
])
# Compile Model...
# Train Model...
model.save('model.keras')
model.predict(data)  # Line works here
model_testing.py:
import tensorflow as tf

model = tf.keras.models.load_model('model.keras')
model.predict(data)  # this line generates the error
EDIT:
I moved the load command into the same file as the training and still receive the exact same error message.
That error is due to the mask_value
that you pass into tf.keras.layers.Masking
not being serialized in a way that can be deserialized when the model is loaded. But because your mask value is a tensor containing all 0s anyway, you can instead just pass a scalar value like this, which eliminates the need to serialize a tensor when saving the model:
tf.keras.layers.Masking(mask_value=0.0)
The scalar broadcasts, making it effectively equivalent to comparing against the tensor containing all 0s. Here is the source where the mask is applied:
ops.any(ops.not_equal(inputs, self.mask_value), axis=-1, keepdims=True)
and ops.not_equal
supports broadcasting.
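A minimal sketch of that equivalence using NumPy in place of keras.ops (the broadcasting semantics are the same; the input values here are made up for illustration):

```python
import numpy as np

# Hypothetical batch: 2 sequences, 3 timesteps, 4 features,
# where all-zero timesteps represent padding.
inputs = np.array([
    [[1.0, 2.0, 3.0, 4.0],
     [0.0, 0.0, 0.0, 0.0],
     [5.0, 0.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0, 0.0],
     [1.0, 1.0, 1.0, 1.0],
     [0.0, 0.0, 0.0, 0.0]],
])

# Mask computed against the zero *vector* (the original intent):
vector_mask = np.any(inputs != np.zeros(4), axis=-1, keepdims=True)

# Mask computed against the *scalar* 0.0, as the Masking layer does
# with mask_value=0.0; the scalar broadcasts across the feature axis:
scalar_mask = np.any(inputs != 0.0, axis=-1, keepdims=True)

print(np.array_equal(vector_mask, scalar_mask))  # True
```

Note that in both versions a timestep like `[5.0, 0.0, 0.0, 0.0]` is kept (mask is True), since `any` only masks timesteps where every feature equals the mask value.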