Tags: python, tensorflow, keras, gpu

How to clean GPU memory when loading another dataset


I'm training a CNN on audio spectrograms and comparing two types of input data (3-second and 30-second clips). The two clip lengths produce spectrograms of different sizes across experiments.

I'm using this to get data:

def get_data(data_type, batch_size):
    assert data_type in ['3s', '30s'], "data_type should be either 3s or 30s"
    if data_type == '3s':
        audio_dir = DATA_PATH / 'genres_3_seconds'
        max_signal_length_to_crop = 67_500
    elif data_type == '30s':
        audio_dir = DATA_PATH / 'genres_original'
        max_signal_length_to_crop = 660_000
    input_shape = (max_signal_length_to_crop, 1)

    train_ds, val_ds = tf.keras.utils.audio_dataset_from_directory(
        directory=audio_dir,
        batch_size=batch_size,
        validation_split=0.2,
        output_sequence_length=max_signal_length_to_crop,
        subset='both',
        label_mode='categorical'
    )
    test_ds = val_ds.shard(num_shards=2, index=0)
    val_ds = val_ds.shard(num_shards=2, index=1)
    return train_ds, val_ds, test_ds, input_shape

I'm using this function to create models.

def get_model(model_type, data_type, input_shape):
    if data_type == '3s':
        WIN_LENGTH = 1024 * 2
        FRAME_STEP = int(WIN_LENGTH / 4)  # / 4, not / 2

    elif data_type == '30s':
        WIN_LENGTH = 1024 * 4
        FRAME_STEP = int(WIN_LENGTH / 2)  # / 2 for 30s clips
    specrtogram_layer = kapre.composed.get_melspectrogram_layer(
        input_shape=input_shape, win_length=WIN_LENGTH, hop_length=FRAME_STEP
    )
    model = Sequential([
        specrtogram_layer,
        *model_dict[model_type],
        Dense(units=10, activation='softmax', name='last_dense')
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=START_LR),
        loss=tf.keras.losses.CategoricalCrossentropy(),
        metrics=['accuracy'],
    )
    return model


model_dict = {
    'CNN_Basic': [
        Conv2D(filters=8, kernel_size=3, activation='relu'),
        MaxPooling2D(2),
        Conv2D(filters=16, kernel_size=3, activation='relu'),
        MaxPooling2D(2),
        Conv2D(filters=32, kernel_size=3, activation='relu'),
        MaxPooling2D(2),
        Flatten(),
        Dense(units=128, activation='relu'),
    ],
    ...
}

I'm running several experiments on different architectures in a loop. This is my training loop:

for data_type in ['3s', '30s']:
    train_ds, val_ds, test_ds, input_shape = get_data(data_type=data_type, batch_size=30)
    for model_type in ['CNN_Basic', ...]:
        model = get_model(model_type, input_shape=input_shape, data_type=data_type)
        model.fit(train_ds, epochs=epochs, validation_data=val_ds)

The error I get:

Traceback (most recent call last):
  File "...\lib\site-packages\tensorflow\python\trackable\base.py", line 205, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "...\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "...\lib\site-packages\tensorflow\python\framework\ops.py", line 1969, in _create_c_op
    raise ValueError(e.message)
ValueError: Exception encountered when calling layer "dense" (type Dense).

Dimensions must be equal, but are 17024 and 6272 for '{{node dense/MatMul}} = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false](Placeholder, dense/MatMul/ReadVariableOp)' with input shapes: [?,17024], [6272,128].

Call arguments received by layer "dense" (type Dense):
  • inputs=tf.Tensor(shape=(None, 17024), dtype=float32)

I think it's caused by something in the datasets, because I got this error only when I ran an experiment with a 3-second spectrogram after the 30-second one. I create a new model each time, and I load the data with tf.keras.utils.audio_dataset_from_directory, assigning it to the same variables on every loop iteration.


Solution

  • OK, I got it. You cannot define a model, or part of one, the way I did in my example and expect the layers to behave as brand new each time you take them from the dictionary. The dictionary gave me the "right" layers, but they were the same instances that had already been used once: after the first model was built and fitted, their state had changed (the Dense weights had a fixed input size, the [6272,128] in the error), so they expected that input shape when I tried to use them again on differently sized data. I fixed it by writing a function that is called each time I build a new model and returns new layer instances.

    So, lesson learned: don't store layer instances in variables (lists, dicts, etc.) when you intend to reuse them in different models. Below is the snippet with the fixed code.

    def get_internal_model(model_type):
        if model_type == 'CNN_Basic':
            internal_layers = [
                Conv2D(filters=8, kernel_size=3, activation='relu'),
                MaxPooling2D(2),
                Conv2D(filters=16, kernel_size=3, activation='relu'),
                MaxPooling2D(2),
                Conv2D(filters=32, kernel_size=3, activation='relu'),
                MaxPooling2D(2),
                Flatten(),
                Dense(units=128, activation='relu'),
            ]
        ...
        return internal_layers
    
    
    def get_model(model_type, data_type, input_shape):
        specrtogram_layer = get_spectrogram_layer(input_shape, data_type)
        model = Sequential([
            specrtogram_layer,
            *get_internal_model(model_type),  # HERE: new layer instances on every call
            Dense(units=10, activation='softmax', name='last_dense')
        ])
        ...
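
To see why the dictionary version fails while the function version works, here is a minimal framework-free sketch. It does not use Keras: `LazyDense`, `SHARED_LAYERS`, `make_layers`, and `run` are hypothetical stand-ins invented for illustration. The only behaviour it borrows is that a layer fixes its weight shape on first use. The widths 6272 and 17024 are taken from the error message above.

```python
class LazyDense:
    """Hypothetical stand-in for a Dense layer (not the Keras class).

    It fixes its weight shape on the first call and keeps it afterwards.
    """

    def __init__(self, units):
        self.units = units
        self.kernel_shape = None  # unknown until the first call

    def __call__(self, n_features):
        if self.kernel_shape is None:
            # First call "builds" the layer for this input width.
            self.kernel_shape = (n_features, self.units)
        elif self.kernel_shape[0] != n_features:
            raise ValueError(
                f"Dimensions must be equal, but are {n_features} "
                f"and {self.kernel_shape[0]}"
            )
        return self.units


# Shared instances: created once, like model_dict in the question.
SHARED_LAYERS = {"CNN_Basic": [LazyDense(128)]}


def make_layers(model_type):
    """Factory: returns brand-new layer instances on every call."""
    if model_type == "CNN_Basic":
        return [LazyDense(128)]
    raise KeyError(model_type)


def run(layers, n_features):
    """Pass an input of width n_features through the layers in order."""
    for layer in layers:
        n_features = layer(n_features)
    return n_features


# Reusing the shared instances: the first width builds the layer,
# the second width no longer fits it.
run(SHARED_LAYERS["CNN_Basic"], 6272)
try:
    run(SHARED_LAYERS["CNN_Basic"], 17024)
    shared_ok = True
    shared_error = ""
except ValueError as err:
    shared_ok = False
    shared_error = str(err)

# Fresh instances from the factory accept both widths.
factory_ok = (
    run(make_layers("CNN_Basic"), 6272) == 128
    and run(make_layers("CNN_Basic"), 17024) == 128
)
```

The same reasoning applies to any object that keeps state after first use: putting the constructor calls inside a function gives each model its own instances.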