I'm trying to recognize turning points in sequences, the points after which some process behaves differently. I use a keras model to do this. Input is the sequence (always the same length) and output should be 0 before the turning points, a 1 after the turning point.
I want the loss function to depend on the distance between the actual turning point and the predicted turning point.
I tried to round (to obtain the label 0 or 1), followed by summing the total number of 1's to get the "index" of the turning point. Assumed here is that the model gives just one turning point, as the data (synthetically produced) also has just one turning point. Tried is:
def dist_loss(yTrue,yPred):
turningPointTrue = K.sum(yTrue)
turningPointPred = K.sum(K.round(yPred))
return K.abs(turningPointTrue-turningPointPred)
This does not work, the following error is given:
ValueError: An operation has
None
for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
I think this means that K.round(yPred) gives a singular value, instead of a vector/tensor. Does anyone know how to solve this issue?
The round
operation has no defined gradient, so it cannot be used at all inside a loss function, since for training of a neural network the gradient of the loss with respect to the weights has to be computed, and this implies that all the parts of the network and loss must be differentiable (or a differentiable approximation must be available).
In your case you should try to find an approximation to round that is differentiable, but unfortunately I don't know if there is one. One example of such approximation is the softmax function as approximation of the max function.