I have a model with an input (a batch of images with shape (height, width, time)) that has a dynamically sized dimension (time), which is only determined at runtime. However, the Dense layer requires a fully defined input shape. Code snippet example:
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Input
# Define an input with undefined (None) dimensions
input_tensor = Input(shape=(None, 256, 256, None, 13))
# Apply a Dense layer (which expects a fully defined shape)
x = Flatten()(input_tensor)
x = Dense(10)(x)
# Build the model
model = tf.keras.models.Model(inputs=input_tensor, outputs=x)
model.summary()
This raises the error:
ValueError: The last dimension of the inputs to a Dense layer should be defined. Found None.
How can I make it work using Flatten instead of alternatives like GlobalAveragePooling3D? Essentially, I'm looking for a way to create a 1D array with the original pixel values that is still compatible with the Dense layer.
This is just not possible because a dense layer has a fixed number of weights. When you call a dense layer after flattening, it is effectively computing
w_0*x_0 + w_1*x_1 + w_2*x_2 + ... + w_(n-1)*x_(n-1) + bias
where the w's are the weights and the x's are the flattened input feature values.
So if, because of your unknown dimension, n can't be known ahead of time, then it's just not possible to configure the network with the appropriate number of weights.
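To see concretely that the weight count is tied to the flattened input size, here is a small illustrative check (not from the question, just standard Keras behaviour): once a Dense layer has been built against a concrete last dimension, its kernel shape is fixed, so an input with a different last dimension can no longer be fed through it.
import tensorflow as tf
from tensorflow.keras.layers import Dense
layer = Dense(10)
# Calling the layer on a tensor with a known last dimension builds its weights
layer(tf.zeros((1, 256)))
print(layer.kernel.shape)  # (256, 10) -> 256 * 10 weights, plus 10 biases
# layer(tf.zeros((1, 300)))  # would now fail: the weights are already allocated for 256 inputs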
Even if you knew the "max time" and wanted to preallocate enough weights in the network to support it, that approach would likely suffer from two problems: the parameter count would blow up, since every extra time step adds another full frame's worth of inputs to the Dense layer, and every shorter input would have to be padded up to that maximum, so most of those weights would mostly be multiplying padding rather than real data.
So the alternatives for capturing the time axis are either a sequence model such as an LSTM or another recurrent neural network, or a 3D convolutional network that relies on pooling across time.
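For reference, here is a minimal sketch of the second option (not your exact architecture; the 16-filter Conv3D and the simplified input shape (time, 256, 256, 13) are assumptions for illustration). The convolution only needs the channel count to be defined, so it tolerates the variable-length time axis, and the global pooling then collapses that axis so the Dense layer sees a fixed-size vector.
import tensorflow as tf
from tensorflow.keras.layers import Conv3D, Dense, GlobalAveragePooling3D, Input
# Time axis left as None; height, width and channels are fixed
input_tensor = Input(shape=(None, 256, 256, 13))
# Conv3D works fine with an unknown time dimension as long as channels are known
x = Conv3D(16, kernel_size=3, padding="same", activation="relu")(input_tensor)
# Pooling over (time, height, width) produces a fixed-length vector of 16 features
x = GlobalAveragePooling3D()(x)
x = Dense(10)(x)
model = tf.keras.models.Model(inputs=input_tensor, outputs=x)
model.summary()
An LSTM-based variant would instead process the time axis frame by frame (for example, TimeDistributed 2D convolutions followed by an LSTM), again ending in a fixed-size vector before the Dense layer.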