Tags: python, keras, loss-function

"Too early" early stopping in Keras


I'm training a neural network with Keras, using early stopping. However, early in training the network hits a point where the validation loss is unnaturally low, and the loss then flattens out for the rest of training, like this: [plot of training and validation loss curves]

When using early stopping with patience = 50, the validation loss flattens out but never drops below the value it reached at the very beginning.

I've trained the network multiple times with the same result, using both the rmsprop (with learning rates from 0.1 to 1e-4) and adam optimizers.

Does anyone know if there is a way to set a "burn in period" (like in a Markov Chain Monte Carlo model) for the network, before monitoring the validation loss for choosing the best model?


Solution

  • Maybe I'm 2-3 years late, but I had the same issue, and I solved it by coding this callback:

    import tensorflow as tf

    class DelayedEarlyStopping(tf.keras.callbacks.EarlyStopping):
        """EarlyStopping that ignores the first `burn_in` epochs."""

        def __init__(self, burn_in, **kwargs):
            super().__init__(**kwargs)
            self.burn_in = burn_in

        def on_epoch_end(self, epoch, logs=None):
            if epoch >= self.burn_in:
                # Past the burn-in period: monitor as usual.
                super().on_epoch_end(epoch, logs)
            else:
                # Still in the burn-in period: keep resetting the parent's
                # internal state (best value, patience counter), so nothing
                # seen so far can count as "best" or trigger stopping.
                super().on_train_begin(logs=None)
    
    early_stopping_monitor = DelayedEarlyStopping(
        100,
        monitor='val_total_loss',
        min_delta=0,
        patience=20,
        verbose=0,
        mode='auto',
        baseline=40,
        restore_best_weights=True
    )
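To see why the `super().on_train_begin()` trick works, here is a minimal, TensorFlow-free sketch of the same burn-in logic (`BurnInMonitor` is a hypothetical name for illustration, not a Keras class). During the burn-in period the best value and patience counter are reset every epoch, so an early dip in validation loss can never become the "best" model or start the patience countdown:

```python
class BurnInMonitor:
    """Toy model of EarlyStopping with a burn-in period."""

    def __init__(self, burn_in, patience):
        self.burn_in = burn_in
        self.patience = patience
        self.best = float("inf")
        self.wait = 0
        self.stopped_epoch = None

    def on_epoch_end(self, epoch, val_loss):
        if epoch < self.burn_in:
            # Mirrors super().on_train_begin(): wipe the state so the
            # early dip is forgotten.
            self.best = float("inf")
            self.wait = 0
            return False
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.stopped_epoch = epoch
                return True  # signal: stop training
        return False

# Simulated loss curve: an unnaturally low dip at the start, then a
# plateau -- the situation described in the question.
losses = [0.10, 0.30, 0.29, 0.28, 0.28, 0.28, 0.28]
monitor = BurnInMonitor(burn_in=2, patience=3)
for epoch, loss in enumerate(losses):
    if monitor.on_epoch_end(epoch, loss):
        break
```

With `burn_in=2`, the dip to 0.10 at epoch 0 is ignored; the best value becomes 0.28 and patience only runs out once the plateau has lasted three epochs past the burn-in.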