How do masked values affect the metrics in Keras?

If I look at the Keras metrics, I see that for categorical_accuracy the values of y_true and y_pred are "just" compared at the end of each epoch:

def categorical_accuracy(y_true, y_pred):
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())

How are masked values handled? If I understand correctly, masking prevents the masked values from influencing training, but the model still produces predictions for them. Those predictions, in my opinion, do influence the metric.

More explanation on how it influences the metric:

In the padding/masking process, I set the padded/masked values in y_true to an unused class, e.g. class 0. When argmax() looks for the maximum in the one-hot encoded y_true, it simply returns 0, because every entry in the (masked) row is the same. I do not have a class 0, since it is my masking value/class, so y_pred and y_true will almost certainly differ at those positions, which reduces the accuracy.
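To make the effect concrete, here is a minimal NumPy sketch of the same comparison that categorical_accuracy performs. The toy arrays, shapes, and variable names are made up for illustration; column 0 plays the role of the unused masking class, and the padded timesteps have all-zero target rows.

```python
import numpy as np

# Hypothetical toy batch: 1 sequence, 4 timesteps, one-hot targets.
# Column 0 is the unused masking class; the last two timesteps are padding,
# so their target rows are all zeros.
y_true = np.array([[[0, 1, 0],
                    [0, 0, 1],
                    [0, 0, 0],
                    [0, 0, 0]]], dtype=float)

# Made-up predictions: correct on both real timesteps, arbitrary on the padding.
y_pred = np.array([[[0.1, 0.8, 0.1],
                    [0.2, 0.1, 0.7],
                    [0.1, 0.6, 0.3],
                    [0.2, 0.2, 0.6]]])

# Same comparison as categorical_accuracy, done in NumPy.
matches = np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1)

# argmax of an all-zero row is 0, so a padded timestep only counts as a match
# if the model happens to predict index 0 there.
naive_accuracy = matches.mean()

# Restrict the comparison to timesteps whose target has an active class.
not_masked = y_true.any(axis=-1)
masked_accuracy = matches[not_masked].mean()

print(naive_accuracy, masked_accuracy)
```

In this toy case every real timestep is predicted correctly, yet the naive value is pulled down by the two padded timesteps.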

Is this already accounted for in the Keras metric and I overlooked it? Otherwise, I would have to create a custom metric or callback that computes something similar to categorical_accuracy, except that all masked values are removed from y_pred and y_true before the comparison.

Solution

Maybe the best answer is this, from the keras.metrics documentation:

A metric function is similar to a loss function, except that the results from evaluating a metric are not used when training the model.

Training is influenced only by the loss function, where masking is implemented. Nevertheless, the displayed metric values do not match the actual results and can lead to misleading conclusions.

As the metric is not used in the training process, a callback function can solve this.

Something like this (based on Andrew Ng). I search for 0 here because, for my masked targets, all one-hot-encoded entries are 0 (no class activated).

import numpy as np
from keras.callbacks import Callback
from sklearn.metrics import accuracy_score

class categorical_accuracy_no_mask(Callback):

    def __init__(self, validation_data):
        # Recent Keras versions do not expose validation_data on the callback
        # or the model, so (x_val, y_val) is passed in explicitly here.
        super().__init__()
        self.x_val, self.y_val = validation_data

    def on_train_begin(self, logs=None):
        self.val_acc = []

    def on_epoch_end(self, epoch, logs=None):
        val_predict = np.asarray(self.model.predict(self.x_val))
        val_targ = np.asarray(self.y_val)
        # True where a target row has an active class, i.e. is not masked.
        # The targets were masked with 0 and the input data with 666.
        keep = val_targ.any(axis=-1)
        # Boolean indexing drops the masked timesteps; argmax turns the
        # remaining one-hot rows and predictions into class labels.
        y_true_nomask = val_targ[keep].argmax(axis=-1)
        y_pred_nomask = val_predict[keep].argmax(axis=-1)

        _val_accuracy = accuracy_score(y_true_nomask, y_pred_nomask)
        self.val_acc.append(_val_accuracy)

        print(" - val_accuracy: %f" % _val_accuracy)

Of course, you could now also add precision, recall, etc.
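As a rough sketch of that extension, the helper below computes precision and recall for a single class after dropping the masked rows. It is plain NumPy rather than a Keras or scikit-learn API; the function name masked_precision_recall and its class_index parameter are hypothetical, and it assumes masked targets are all-zero one-hot rows as above.

```python
import numpy as np

def masked_precision_recall(y_true, y_pred, class_index):
    """Precision and recall for one class, ignoring all-zero (masked) target rows.

    Hypothetical helper: assumes one-hot targets in which masked rows are all zeros.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    # Keep only rows whose target has an active class.
    keep = y_true.any(axis=-1)
    true_labels = y_true[keep].argmax(axis=-1)
    pred_labels = y_pred[keep].argmax(axis=-1)

    true_pos = np.sum((pred_labels == class_index) & (true_labels == class_index))
    predicted_pos = np.sum(pred_labels == class_index)
    actual_pos = np.sum(true_labels == class_index)

    # Report 0.0 when a denominator is empty (a convention chosen for this sketch).
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return float(precision), float(recall)
```

Inside on_epoch_end this could be called with val_targ and val_predict for each class of interest, and the results appended to lists in the same way as val_acc.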