How to combine a masked loss with the TensorFlow 2 TimeseriesGenerator


We are trying to use a convolutional LSTM to predict the next image in a sequence from the past 7 timesteps. We used the Keras TimeseriesGenerator class (TensorFlow 2) to create our time series data:

    train_gen = TimeseriesGenerator(
        data,
        data,
        length=7,
        batch_size=32,
        shuffle=False
    )
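
For readers unfamiliar with the generator, the windowing it performs here can be sketched in plain NumPy. This is a minimal illustration, not the library's implementation: `make_windows` and `frames` are hypothetical names, and it assumes the default sampling rate and stride of 1.

```python
import numpy as np

def make_windows(data, length):
    """Pair each run of `length` consecutive frames with the frame that follows it."""
    n_samples = len(data) - length
    x = np.stack([data[i:i + length] for i in range(n_samples)])
    y = data[length:]
    return x, y

# Small stand-in for the real data: 20 frames of shape (55, 50, 1).
frames = np.random.random((20, 55, 50, 1))
x, y = make_windows(frames, length=7)
print(x.shape)  # (13, 7, 55, 50, 1)
print(y.shape)  # (13, 55, 50, 1)
```

The generator additionally groups these samples into batches of `batch_size`.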

Every image (timestep) has the shape (55, 50, 1), so the generator produces input batches of shape (32, 7, 55, 50, 1) and target batches of shape (32, 55, 50, 1). There is a twist, however: we only want to compute the loss over a masked region of the image. The mask is constant, and we have stored it as a tensor constant in the following way:

    mask = tf.keras.backend.constant(mask)

Our idea was then to give this constant as a second input to our model and use it to compute a masked loss using a custom loss function:

    def masked_MSE_loss(y_true, y_pred, mask):
        y_pred_masked = tf.math.multiply(y_pred, mask)
        mse = tf.keras.losses.mean_squared_error(y_true = y_true, y_pred = y_pred_masked)
        return mse

Our model then looks like the following:

    # Define the input tensors
    inputs = Input(shape=(lookback, 55, 50, 1))
    input_mask = Input(tensor=mask)

    # First stack of convlstm layers
    convlstm1 = layers.ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', activation='tanh', return_sequences=True)(inputs)
    batchnorm1 = layers.BatchNormalization()(convlstm1)
    convlstm2 = layers.ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', activation='tanh', return_sequences=False)(batchnorm1)

    # Second stack of convlstm layers
    convlstm3 = layers.ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', activation='tanh', return_sequences=True)(inputs)
    batchnorm2 = layers.BatchNormalization()(convlstm3)
    convlstm4 = layers.ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', activation='tanh', return_sequences=False)(batchnorm2)

    # Concatenate outputs of two stacks
    concatenation = layers.concatenate([convlstm2, convlstm4])
    outputs = layers.Conv2D(filters=1, kernel_size=1, padding="same", activation='tanh')(concatenation)

    # Create the model
    model = Model(inputs=[inputs, input_mask], outputs=outputs)
    model.add_loss(masked_MSE_loss(inputs, outputs, input_mask))

    # Compile the model
    model.compile(optimizer='adam', loss=None, metrics=['mae'])

Finally, we tried to fit the model batch by batch in an attempt to combine the TimeseriesGenerator output with our constant input:

    for batch in train_gen:
        batch_input, batch_target = batch
        model.fit(x=[batch_input, np.repeat(mask[np.newaxis, :, :, :], len(batch), axis=0)], y=batch_target, epochs=1)

We loop over the batches, repeat the constant len(batch) times, feed both to the network, and train for 1 epoch per batch. This gives us the following error:

    ValueError: Input 1 of layer "model" is incompatible with the layer: expected shape=(None, 50, 1), found shape=(32, 55, 50, 1)

This made us think that we needed to feed the mask constant only once:

    for batch in train_gen:
        batch_input, batch_target = batch
        model.fit(x=[batch_input, mask], y=batch_target, epochs=100)

But this gave us the error:

    ValueError: Data cardinality is ambiguous:
      x sizes: 32, 55
      y sizes: 32
    Make sure all arrays contain the same number of samples

Clearly, the model expects the first dimension of every input to be the batch size, yet this error seems to contradict our previous attempt, in which we repeated the constant len(batch) times.
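
One possible reading of the two messages, sketched in NumPy: if Keras treats the leading axis of every input as the sample axis, a bare mask of shape (55, 50, 1) looks like 55 samples of shape (50, 1), which would match both "expected shape=(None, 50, 1)" and "x sizes: 32, 55". The all-ones mask and zero-filled batch below are hypothetical stand-ins, not the real data.

```python
import numpy as np

mask = np.ones((55, 50, 1))                  # hypothetical stand-in for the real mask
batch_input = np.zeros((32, 7, 55, 50, 1))   # stand-in for one generator batch
batch_target = np.zeros((32, 55, 50, 1))
batch = (batch_input, batch_target)

# Read with a leading sample axis, the bare mask is 55 samples of shape (50, 1).
print(mask.shape[0], mask.shape[1:])   # 55 (50, 1)

# len(batch) counts the elements of the (inputs, targets) tuple,
# not the number of samples in the batch.
print(len(batch))                      # 2
print(batch_input.shape[0])            # 32

# One copy of the mask per sample, with an explicit leading sample axis.
batched_mask = np.repeat(mask[np.newaxis], batch_input.shape[0], axis=0)
print(batched_mask.shape)              # (32, 55, 50, 1)
```

Under this reading, a batched mask of shape (32, 55, 50, 1) would only be accepted if the mask input were declared with the per-sample shape (55, 50, 1), for example with Input(shape=(55, 50, 1)) rather than Input(tensor=mask).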

So our question is: How do we fit our model with a constant tensor as a second input (used to compute a masked loss over our predictions) together with our TimeseriesGenerator data?


Solution

  • Maybe try something like this:

    import tensorflow as tf
    from tensorflow.keras import Input, layers, Model
    from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
    import numpy as np
    
    ################################ Creating Mask #################################
    mask = np.ones((55,50))
    mask = tf.keras.backend.constant(mask)
    mask = tf.expand_dims(mask, -1)
    ################################################################################
    
    lookback = 7
    batch_size = 16
    
    # Create data with shape (3653, 55, 50, 1) with 3653 timesteps
    data = np.random.random((3653, 55, 50, 1))
    
    train_gen = TimeseriesGenerator(
        data,
        data,
        length=lookback,
        batch_size=batch_size,
        shuffle=False
    )
    
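    def masked_MSE_loss(y_true, y_pred):
        # mask has shape (55, 50, 1) and broadcasts across the batch axis,
        # so pixels outside the mask are zeroed in both tensors.
        y_pred_masked = tf.math.multiply(y_pred, mask)
        y_true_masked = tf.math.multiply(y_true, mask)
        return tf.keras.losses.mean_squared_error(y_true=y_true_masked, y_pred=y_pred_masked)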
    
    # Define the input tensors
    inputs = Input(shape=(lookback, 55, 50, 1))
    
    # First stack of convlstm layers
    convlstm1 = layers.ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', activation='tanh', return_sequences=True)(inputs)
    batchnorm1 = layers.BatchNormalization()(convlstm1)
    convlstm2 = layers.ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', activation='tanh', return_sequences=False)(batchnorm1)
    
    # Second stack of convlstm layers
    convlstm3 = layers.ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', activation='tanh', return_sequences=True)(inputs)
    batchnorm2 = layers.BatchNormalization()(convlstm3)
    convlstm4 = layers.ConvLSTM2D(filters=128, kernel_size=(3, 3), padding='same', activation='tanh', return_sequences=False)(batchnorm2)
    
    # Concatenate outputs of two stacks
    concatenation = layers.concatenate([convlstm2, convlstm4])
    outputs = layers.Conv2D(filters=1, kernel_size=1, padding="same", activation='tanh')(concatenation)
    
    model = Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer='adam', loss=masked_MSE_loss, metrics=['mae'])
    model.fit(train_gen, epochs=100)
    

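    Note that the mask is no longer an input to your model. Because it is constant, the loss function can reference it directly, and its shape (55, 50, 1) broadcasts against each batch of targets and predictions.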
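
One thing to be aware of with a loss of this form: multiplying both tensors by a binary mask zeroes the error outside the masked region, but the mean is still taken over every pixel, so the reported value shrinks with the fraction of pixels the mask keeps. A possible variant divides by the number of kept pixels instead. The NumPy sketch below illustrates the arithmetic only; `masked_mse` and the 4x4 half-image mask are hypothetical, and it assumes the mask contains only 0s and 1s.

```python
import numpy as np

def masked_mse(y_true, y_pred, mask):
    """Mean squared error over the pixels where mask == 1 only."""
    # mask has shape (H, W, 1) and broadcasts over the batch axis.
    squared_error = mask * (y_true - y_pred) ** 2
    n_valid = mask.sum() * y_true.shape[0]
    return squared_error.sum() / n_valid

# Hypothetical binary mask that keeps the left half of each 4x4 image.
mask = np.zeros((4, 4, 1))
mask[:, :2, :] = 1.0

y_true = np.zeros((2, 4, 4, 1))
y_pred = np.ones((2, 4, 4, 1))   # every pixel is off by exactly 1

# Averaging over all pixels after masking halves the error...
print((mask * (y_true - y_pred) ** 2).mean())  # 0.5
# ...while normalizing by the mask area reports the per-pixel error.
print(masked_mse(y_true, y_pred, mask))        # 1.0
```

For a fixed mask both versions differ only by a constant factor, so this mainly affects how the loss value is read. The 'mae' metric passed to compile is not masked and is still computed over the full image.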