Tags: python, keras, loss-function

Keras custom loss with external data


Hello! I'm trying to write a custom loss function that uses external data.

My code looks like this:

import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential
from tensorflow.python.ops import math_ops

A = 35  # number of classes (also used as the batch size below)

classes_distributions = K.variable(np.full((A, A), 1.0 / A, dtype=float))

def my_loss(distributions):
    def loss(y_true, y_pred):
        x = math_ops.argmax(y_pred, axis=0)
        for i in range(A):
            classes_slice = distributions[x[i], :]
            _class = math_ops.add(classes_slice, K.flatten(y_true[:, i]))
            _class = math_ops.multiply(_class, 1.0 / K.sum(_class))
            classes_slice.assign(_class)
        # Mean entropy of the class distributions.
        log_p = math_ops.log(distributions)
        H = - math_ops.multiply(distributions, log_p)
        H = K.sum(H, axis=1)
        H = K.mean(H)
        return tf.fill((A,), H)
    return loss

model = Sequential()
model.add(LSTM(units=A*10, input_shape=(A, 1)))
model.add(Dense(units=A, activation='softmax'))
model.compile(loss=my_loss(classes_distributions))
history = model.fit(
    data, 
    data, # The main idea is using loss-function only
    epochs=200,
    verbose=True, 
    batch_size=A)

But I get this error:

ValueError: in user code:

...

ValueError: No gradients provided for any variable: (['lstm_69/lstm_cell_69/kernel:0', 'lstm_69/lstm_cell_69/recurrent_kernel:0', 'lstm_69/lstm_cell_69/bias:0', 'dense_69/kernel:0', 'dense_69/bias:0'],). Provided `grads_and_vars` is ((None, ), (None, ), (None, ), (None, ), (None, )).

I understand that the problem is in return tf.fill((A,), H), but I have no idea where the mistake is.

Maybe it is some dimensional problem, because if I replace return tf.fill((A,), H) with return K.mean(K.square(y_true - y_pred), axis=-1) (from the TensorFlow sample code), it works fine.

But

tf.print(tf.shape(K.mean(K.square(y_true - y_pred), axis=-1)))
tf.print(tf.shape(tf.fill((A,), H)))

returns

[35]
[35]

And this is the same shape!
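The shapes match, but gradient flow is what matters: tf.fill builds a fresh constant tensor, so the returned loss has no path back to y_pred or any trainable variable. A minimal standalone sketch (with a made-up 2x2 variable, not the model above) reproduces the None gradients:

```python
import tensorflow as tf

# The error is about gradients, not shapes: a loss tensor built without
# y_pred in its graph yields None gradients, whatever shape it has.
w = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
x = tf.constant([[1.0], [1.0]])

with tf.GradientTape() as tape:
    y_pred = tf.matmul(w, x)        # depends on w
    fake_loss = tf.fill((2,), 0.5)  # same shape as a valid per-sample
                                    # loss, but disconnected from y_pred

grad = tape.gradient(fake_loss, w)
print(grad)  # None -> "No gradients provided for any variable"
```

This is exactly what the optimizer reports: every entry of grads_and_vars is (None, variable).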


Solution

  • Thanks to Dr. Snoopy in the comments.

    O = tf.zeros(tf.shape(y_pred))
    O = math_ops.multiply(O, y_pred)    # 0 * y_pred: still zero, but y_pred is now in the graph
    O = math_ops.add(O, distributions)  # O equals distributions, with a gradient path to y_pred
    log_p = math_ops.log(O)
    H = - math_ops.multiply(O, log_p)
    H = K.sum(H, axis=1)
    H = K.mean(H)                       # mean entropy of the class distributions
    return tf.fill((A,), H)
    

    solves my problem.

    Funny, because it multiplies by zero and adds the answer back, purely to connect y_pred to the output.

    And

    O = math_ops.multiply(tf.zeros(tf.shape(y_pred)), y_pred)
    O = K.mean(O)
    # Some math with scalar-tensor output loss.
    return loss + O  # Correct here for batch_size = tf.shape(y_pred)[0]
    

    also works.
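Both variants rely on the same trick: 0 * y_pred is numerically zero, but it keeps y_pred inside the computation graph, so TensorFlow can trace gradients (which are zero tensors rather than None). A standalone sketch of the second variant, with made-up values:

```python
import tensorflow as tf

# Sketch of the zero-multiplication trick: mean(0 * y_pred) adds nothing
# to the loss value, but reconnects y_pred (and the weights) to the graph.
w = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
x = tf.constant([[1.0], [1.0]])

with tf.GradientTape() as tape:
    y_pred = tf.matmul(w, x)
    O = tf.reduce_mean(tf.zeros(tf.shape(y_pred)) * y_pred)
    loss = 0.5 + O  # 0.5 stands in for "some math with scalar-tensor output"

grad = tape.gradient(loss, w)
print(grad)  # a zeros tensor, not None: the optimizer no longer complains
```

Note that the gradients are identically zero, so this only silences the error; any signal that should actually train the network must come from the rest of the loss.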