For reference the full error is here:
WARNING:tensorflow:Model was constructed with shape (None, 65536) for input KerasTensor(type_spec=TensorSpec(shape=(None, 65536), dtype=tf.float32, name='input_1'), name='input_1', description="created by layer 'input_1'"), but it was called on an input with incompatible shape (None, 65536, None).
I am using kymatio to classify audio signals. Before constructing the model I use tensorflow's tf.keras.utils.audio_dataset_from_directory to create the training and testing sets. The audio samples are of shape (65536,) before the sets are created. To create the sets I use the following code:
import tensorflow as tf

T = 2**16        # samples per clip (65536)
J = 8            # scattering scale parameter
Q = 12           # wavelets per octave
log_eps = 1e-6   # offset that keeps the log finite
SEED = 42

train_dataset = tf.keras.utils.audio_dataset_from_directory(
    '../train',
    labels='inferred',
    label_mode='int',
    class_names=['x', 'y', 'z', 'xy', 'xz', 'yz', 'xyz'],
    batch_size=32,
    output_sequence_length=T,
    ragged=False,
    shuffle=True,
    seed=SEED,
    follow_links=False
)
The element_spec of the train_dataset is (TensorSpec(shape=(None, 65536, None), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.int32, name=None)). So at some point the shape is changing in the TensorSpec to (None, 65536, None) for some reason...
The model is constructed as follows and the error points to model.fit(...).
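from tensorflow.keras import layers
from kymatio.keras import Scattering1D

x_in = layers.Input(shape=(T,))  # shape must be a tuple: (T,), not (T)
x = Scattering1D(J, Q=Q)(x_in)
# Drop the zeroth-order scattering coefficient
x = layers.Lambda(lambda x: x[..., 1:, :])(x)
# Log-scale the coefficients; log_eps avoids log(0)
x = layers.Lambda(lambda x: tf.math.log(tf.abs(x) + log_eps))(x)
# Average over time, then normalize each scattering channel
x = layers.GlobalAveragePooling1D(data_format='channels_first')(x)
x = layers.BatchNormalization(axis=1)(x)
x_out = layers.Dense(7, activation='softmax')(x)  # 7 classes
model = tf.keras.models.Model(x_in, x_out)
model.summary()
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=50)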
Check the docs regarding tf.keras.utils.audio_dataset_from_directory:
[...] audio has shape (batch_size, sequence_length, num_channels)
Just use tf.squeeze to remove the additional dimension if you are only working with single-channel audio:
train_dataset = train_dataset.map(lambda x, y: (tf.squeeze(x, axis=-1), y))
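To check the effect of the squeeze without loading any files, here is a minimal sketch on a synthetic dataset. The random tensors, the zero labels, and the names dummy_dataset and squeezed are placeholders made up for illustration; they only imitate the (batch_size, sequence_length, num_channels) layout that the docs describe for mono audio.

```python
import tensorflow as tf

T = 2**16  # same sequence length as in the question

# Placeholder data standing in for mono audio clips: 8 clips, T samples,
# 1 channel. These are random values, not real audio.
audio = tf.random.uniform((8, T, 1))
labels = tf.zeros((8,), dtype=tf.int32)

# Hypothetical stand-in for the dataset returned by
# audio_dataset_from_directory (assumption: batched, trailing channel axis).
dummy_dataset = tf.data.Dataset.from_tensor_slices((audio, labels)).batch(4)
print(dummy_dataset.element_spec[0].shape)  # (None, 65536, 1)

# Same map as above: drop the trailing channel axis, keep the labels.
squeezed = dummy_dataset.map(lambda x, y: (tf.squeeze(x, axis=-1), y))
print(squeezed.element_spec[0].shape)  # (None, 65536)
```

Printing train_dataset.element_spec after the map on your real dataset should likewise show (None, 65536) for the audio, which matches layers.Input(shape=(T,)).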
If you want to keep the dimension, try:
x_in = layers.Input(shape=(T, 1))
I would recommend going through this tutorial.