Search code examples
pythontensorflowkerasloss-function

TensorFlow session inside Keras custom loss function


After going through some Stack questions and the Keras documentation, I manage to write some code trying to evaluate the gradient of the output of a neural network w.r.t its inputs, the purpose being a simple exercise of approximating a bivariate function (f(x,y) = x^2+y^2) using as loss the difference between analytical and automatic differentiation.

Combining answers from two questions (Keras custom loss function: Accessing current input pattern and Getting gradient of model output w.r.t weights using Keras ), I came up with this:

import tensorflow as tf
from keras import backend as K
from keras.models import Model
from keras.layers import Dense, Activation, Input

def custom_loss(input_tensor):

    outputTensor = model.output       
    listOfVariableTensors = model.input      
    gradients = K.gradients(outputTensor, listOfVariableTensors)

    sess = tf.InteractiveSession()
    sess.run(tf.initialize_all_variables())
    evaluated_gradients = sess.run(gradients,feed_dict={model.input:input_tensor})

    grad_pred = K.add(evaluated_gradients[0], evaluated_gradients[1])
    grad_true = k.add(K.scalar_mul(2, model.input[0][0]), K.scalar_mul(2, model.input[0][1])) 

    return K.square(K.subtract(grad_pred, grad_true))

input_tensor = Input(shape=(2,))
hidden = Dense(10, activation='relu')(input_tensor)
out = Dense(1, activation='sigmoid')(hidden)
model = Model(input_tensor, out)
model.compile(loss=custom_loss_wrapper(input_tensor), optimizer='adam')

Which yields the error: TypeError: The value of a feed cannot be a tf.Tensor object. because of feed_dict={model.input:input_tensor}. I understand the error, I just don't know how to fix it.

From what I gathered, I can't simply pass input data into the loss function, it must be a tensor. I realized Keras would 'understand' it when I call input_tensor. This all just leads me to think I'm doing things the wrong way, trying to evaluate the gradient like that. Would really appreciate some enlightenment.


Solution

  • I don't really understand why you want this loss function, but I will provide an answer anyway. Also, there is no need to evaluate the gradient within the function (in fact, you would be "disconnecting" the computational graph). The loss function could be implemented as follows:

    from keras import backend as K
    from keras.models import Model
    from keras.layers import Dense, Input
    
    def custom_loss(input_tensor, output_tensor):
        def loss(y_true, y_pred):
            gradients = K.gradients(output_tensor, input_tensor)
            grad_pred = K.sum(gradients, axis=-1)
            grad_true = K.sum(2*input_tensor, axis=-1)
            return K.square(grad_pred - grad_true)
        return loss
    
    input_tensor = Input(shape=(2,))
    hidden = Dense(10, activation='relu')(input_tensor)
    output_tensor = Dense(1, activation='sigmoid')(hidden)
    model = Model(input_tensor, output_tensor)
    model.compile(loss=custom_loss(input_tensor, output_tensor), optimizer='adam')