
Why does saving and reloading a 1D convolutional keras model cause it to fail to generalize to wider windows?


So I'm working with the example notebook that tensorflow provides to detail working with time-series formatted data.

https://www.tensorflow.org/tutorials/structured_data/time_series

Everything is going fine, I just had a quick question on saving and loading models. For my current research, I need to be able to train a model, save it, and then reload it for testing at a later time.

The entire code for the notebook can be found at the link above, but essentially the training and compiling process uses the following method, where a model and a window object are passed in:

MAX_EPOCHS = 20

def compile_and_fit(model, window, patience=2):
  early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                    patience=patience,
                                                    mode='min')

  model.compile(loss=tf.keras.losses.MeanSquaredError(),
                optimizer=tf.keras.optimizers.Adam(),
                metrics=[tf.keras.metrics.MeanAbsoluteError()])

  history = model.fit(window.train, epochs=MAX_EPOCHS,
                      validation_data=window.val,
                      callbacks=[early_stopping])
  return history

The model in question looks like

conv_model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(filters=32,
                           kernel_size=(CONV_WIDTH,),
                           activation='relu'),
    tf.keras.layers.Dense(units=32, activation='relu'),
    tf.keras.layers.Dense(units=1),
])

In the notebook, this is essentially the process that runs the training/compiling method and evaluates the model:

history = compile_and_fit(conv_model, conv_window)

IPython.display.clear_output()
val_performance['Conv'] = conv_model.evaluate(conv_window.val)
performance['Conv'] = conv_model.evaluate(conv_window.test, verbose=0)

After this it is tested on a wider window in the following procedure

wide_window = WindowGenerator(
    input_width=24, label_width=24, shift=1,
    label_columns=['T (degC)'])
print("Wide window")
print('Input shape:', wide_window.example[0].shape)
print('Labels shape:', wide_window.example[1].shape)
print('Output shape:', conv_model(wide_window.example[0]).shape)

This part works fine, but if I add in two lines to save and reload the model as shown

history = compile_and_fit(conv_model, conv_window)
conv_model.save('test.keras')
conv_model = tf.keras.models.load_model('test.keras')
IPython.display.clear_output()
val_performance['Conv'] = conv_model.evaluate(conv_window.val)
performance['Conv'] = conv_model.evaluate(conv_window.test, verbose=0)

and then run

print("Wide window")
print('Input shape:', wide_window.example[0].shape)
print('Labels shape:', wide_window.example[1].shape)
print('Output shape:', conv_model(wide_window.example[0]).shape)

I receive the following error.

Wide window
Input shape: (32, 24, 19)
Labels shape: (32, 24, 1)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[58], line 4
      2 print('Input shape:', wide_window.example[0].shape)
      3 print('Labels shape:', wide_window.example[1].shape)
----> 4 print('Output shape:', conv_model(wide_window.example[0]).shape)

File ~/py38-env/lib/python3.8/site-packages/keras/src/utils/traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File ~/py38-env/lib/python3.8/site-packages/keras/src/engine/input_spec.py:298, in assert_input_compatibility(input_spec, inputs, layer_name)
    296 if spec_dim is not None and dim is not None:
    297     if spec_dim != dim:
--> 298         raise ValueError(
    299             f'Input {input_index} of layer "{layer_name}" is '
    300             "incompatible with the layer: "
    301             f"expected shape={spec.shape}, "
    302             f"found shape={display_shape(x.shape)}"
    303         )

ValueError: Input 0 of layer "sequential_3" is incompatible with the layer: expected shape=(None, 3, 19), found shape=(32, 24, 19)

This also occurs when I save and reload using the .h5 file format. Even if I change the file name and try again, it still throws the error. Note that the window size it is trained on is

CONV_WIDTH = 3
conv_window = WindowGenerator(
    input_width=CONV_WIDTH,
    label_width=1,
    shift=1,
    label_columns=['T (degC)'])

but it should generalize to wider windows, and indeed it does when the model is not saved and reloaded.

Any insight into why this is occurring would be greatly appreciated, thanks!


Solution

  • It seems that after saving, the input shape of the model is no longer flexible.
    You can see the expected input shape with print(conv_model.input_shape).

    If we go through the training, we can see how the input_shape changes. Note that for now this is without saving:

    conv_model = tf.keras.Sequential([...])
    

    After model creation, conv_model has no .input_shape, since there is neither an Input layer nor an input_shape=(...) argument on the first (Conv1D) layer. The model has not seen any data yet and has no idea what input shape to expect.

    print('Output shape:', conv_model(wide_window.example[0]).shape)
    

    Now the model has seen data, and we get conv_model.input_shape=(32, 3, 19). This is a very explicit shape: normally the first dimension, the batch dimension, would be None, indicating a flexible size for that dimension. That is because the last batch is not guaranteed to have length 32 (with batch_size=32), but could be the remainder of the data.

    history = compile_and_fit(conv_model, conv_window)
    

    Now the model sees the full dataset, with varying batch lengths, and we get conv_model.input_shape=(None, 3, 19). The window length and feature count are still fixed, since they are the same at every step, but the first dimension has become flexible.

    print('Output shape:', conv_model(wide_window.example[0]).shape)
    

    If we give the model the wide_window as input, the input shape changes again: conv_model.input_shape=(None, None, 19). Now the time axis has become flexible too, since the previous value of 3 no longer fits. Note that this only works because the model had no fixed input shape to begin with. If you add a tf.keras.layers.Input((3, 19)) layer to the Sequential model, the same error occurs as in your question.

    When you save and load the model after training, the shape (None, 3, 19) apparently becomes fixed, just as it would be if you had set the input shape yourself with e.g. an Input layer.
    The only (important) difference between the loaded model and the original is the .input_spec attribute:

    conv_model.input_spec = None
    loaded_conv_model.input_spec = [InputSpec(shape=(None, 3, 19), ndim=3)]
    

    If you set the attribute back to None (loaded_conv_model.input_spec = None), the model works with flexible input again, but this feels a bit hacky. If you know that you'll work with flexible time-axis data (not all sequences have the same length), you can set it directly in the model:

    conv_model = tf.keras.Sequential([
        tf.keras.layers.Input((None, 19)),  # batch dimension gets omitted here
        ...  # rest of the model
    ])
    

    Now the model has the input shape conv_model.input_shape=(None, None, 19) and works with different batch sizes and window lengths after a save/load round trip.
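Putting both fixes together, here is a minimal end-to-end sketch. Zero-filled arrays stand in for the tutorial's weather data, and the 19-feature width and layer sizes mirror the tutorial's conv_model; the input_spec reset is the workaround described above, observed on the Keras 2.x setup in the question, so behavior on other versions is an assumption:

```python
import numpy as np
import tensorflow as tf

CONV_WIDTH = 3

# Same architecture as the tutorial's conv_model, with no fixed input shape.
conv_model = tf.keras.Sequential([
    tf.keras.layers.Conv1D(filters=32, kernel_size=CONV_WIDTH, activation='relu'),
    tf.keras.layers.Dense(units=32, activation='relu'),
    tf.keras.layers.Dense(units=1),
])

# Stand-in data: narrow windows (batch=32, time=3, features=19) and wide (time=24).
narrow = np.zeros((32, CONV_WIDTH, 19), dtype=np.float32)
wide = np.zeros((32, 24, 19), dtype=np.float32)

# Build the model by calling it on narrow windows, then save and reload.
conv_model(narrow)
conv_model.save('test.keras')
loaded = tf.keras.models.load_model('test.keras')

# Hacky fix: clear the input spec that loading pinned to (None, 3, 19).
loaded.input_spec = None
print(loaded(wide).shape)

# Cleaner fix: declare a flexible time axis up front via an Input layer.
flexible_model = tf.keras.Sequential([
    tf.keras.layers.Input((None, 19)),  # batch dimension is omitted here
    tf.keras.layers.Conv1D(filters=32, kernel_size=CONV_WIDTH, activation='relu'),
    tf.keras.layers.Dense(units=32, activation='relu'),
    tf.keras.layers.Dense(units=1),
])
flexible_model.save('flexible.keras')
reloaded = tf.keras.models.load_model('flexible.keras')
print(reloaded(narrow).shape)
print(reloaded(wide).shape)
```

Because Conv1D uses 'valid' padding by default, the time axis shrinks by CONV_WIDTH - 1, so a 24-step input yields 22 output steps; both models therefore map (32, 24, 19) inputs to (32, 22, 1) outputs.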