I would like to implement a custom loss function shown in this paper with Keras.
My loss is not going down, and I have the feeling that it is because of the implementation of the loss: it doesn't use Keras' backend for everything, but rather a combination of some K functions, simple operations, and numpy:
    import numpy as np
    from keras import backend as K

    def l1_matrix_norm(M):
        # Maximum absolute column sum
        return K.cast(K.max(K.sum(K.abs(M), axis=0)), 'float32')

    def reconstruction_loss(patch_size, mask, center_weight=0.9):
        # mask is a numpy array; add a leading axis so it broadcasts over the batch
        mask = mask.reshape(1, *mask.shape).astype('float32')
        mask_inv = 1 - mask

        def loss(y_true, y_pred):
            diff = y_true - y_pred
            center_part = mask * diff
            center_part_normed = l1_matrix_norm(center_part)
            surr_part = mask_inv * diff
            surr_part_normed = l1_matrix_norm(surr_part)
            num_pixels = np.prod(patch_size).astype('float32')
            numerator = center_weight * center_part_normed + (1 - center_weight) * surr_part_normed
            return numerator / num_pixels

        return loss
Is it necessary to use Keras functions? If so, for which types of operations do I need them? (I saw some code where simple operations such as addition don't use K.)
Also, if I have to use a Keras backend function, can I use the TensorFlow function instead?
NN training depends on being able to compute the derivatives of all functions in the graph, including the loss function. Keras backend functions and TensorFlow functions are annotated so that TensorFlow (or another backend) automatically knows how to compute their gradients. That is not the case for numpy functions. It is possible to use non-TF functions if you know how to compute their gradients manually (see tf.custom_gradient). In general, I would recommend sticking with backend functions where possible and using TensorFlow functions when necessary.
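To illustrate the point about gradients, here is a minimal sketch, assuming TensorFlow 2.x with eager execution (the question does not say which version is used). It rewrites the loss from the question using only TensorFlow ops and checks with tf.GradientTape that a gradient reaches y_pred, then shows that routing the same kind of computation through numpy leaves no gradient. Arithmetic operators such as - and * on tensors are overloaded to TensorFlow ops, so they are differentiable without an explicit K call. The 4x4 patch, the 2x2 center mask, and the batch of two are made-up values for the demonstration.

```python
import numpy as np
import tensorflow as tf


def l1_matrix_norm(m):
    # Maximum absolute column sum, computed only with TensorFlow ops.
    return tf.reduce_max(tf.reduce_sum(tf.abs(m), axis=0))


def reconstruction_loss(patch_size, mask, center_weight=0.9):
    # The mask is a constant, so preparing it with numpy is fine:
    # no gradient has to flow through it.
    mask = tf.constant(mask.reshape(1, *mask.shape), dtype=tf.float32)
    mask_inv = 1.0 - mask
    num_pixels = float(np.prod(patch_size))

    def loss(y_true, y_pred):
        diff = y_true - y_pred  # operators on tensors are TensorFlow ops
        center = l1_matrix_norm(mask * diff)
        surround = l1_matrix_norm(mask_inv * diff)
        weighted = center_weight * center + (1 - center_weight) * surround
        return weighted / num_pixels

    return loss


# Hypothetical data: 4x4 patches with a 2x2 center region, batch of 2.
mask = np.zeros((4, 4), dtype="float32")
mask[1:3, 1:3] = 1.0
loss_fn = reconstruction_loss((4, 4), mask)

y_true = tf.ones((2, 4, 4))
y_pred = tf.Variable(tf.zeros((2, 4, 4)))

# TensorFlow ops only: the tape can trace the loss back to y_pred.
with tf.GradientTape() as tape:
    value = loss_fn(y_true, y_pred)
grad = tape.gradient(value, y_pred)

# Same idea through numpy: the result is detached from y_pred,
# so there is nothing for the tape to differentiate.
with tf.GradientTape() as np_tape:
    np_value = tf.constant(np.sum(np.abs(y_true.numpy() - y_pred.numpy())))
np_grad = np_tape.gradient(np_value, y_pred)

print("gradient via TensorFlow ops available:", grad is not None)
print("gradient via numpy available:", np_grad is not None)
```

A check like this is a cheap way to find out whether a custom loss is differentiable with respect to the predictions before passing it to model.compile.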