Hello! I'm trying to write a custom loss function that uses external data.
My code looks like this:
import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.python.ops import math_ops

# A is the number of classes.
classes_distributions = K.variable(np.full((A, A), 1.0 / A, dtype=float))

def my_loss(distributions):
    def loss(y_true, y_pred):
        x = math_ops.argmax(y_pred, axis=0)
        for i in range(A):
            classes_slice = distributions[x[i], :]
            _class = math_ops.add(classes_slice, K.flatten(y_true[:, i]))
            _class = math_ops.multiply(_class, 1.0 / K.sum(_class))
            classes_slice.assign(_class)
        # Mean entropy of classes.
        log_p = math_ops.log(distributions)
        H = -math_ops.multiply(distributions, log_p)
        H = K.sum(H, axis=1)
        H = K.mean(H)
        return tf.fill((A,), H)
    return loss
model = Sequential()
model.add(LSTM(units=A * 10, input_shape=(A, 1)))
model.add(Dense(units=A, activation='softmax'))
model.compile(loss=my_loss(classes_distributions))
history = model.fit(
    data,
    data,  # The main idea is using the loss function only.
    epochs=200,
    verbose=True,
    batch_size=A)
But I get this error:
ValueError: in user code:
...
ValueError: No gradients provided for any variable: (['lstm_69/lstm_cell_69/kernel:0', 'lstm_69/lstm_cell_69/recurrent_kernel:0', 'lstm_69/lstm_cell_69/bias:0', 'dense_69/kernel:0', 'dense_69/bias:0'],). Provided `grads_and_vars` is ((None, ), (None, ), (None, ), (None, ), (None, )).
I understand that the problem is in return tf.fill((A,), H), but I have no idea where the mistake is.
Maybe it is some dimensional problem, because if I replace
return tf.fill((A,), H)
with
return K.mean(K.square(y_true - y_pred), axis=-1)
(from the TensorFlow sample code), it's OK.
But
tf.print(tf.shape(K.mean(K.square(y_true - y_pred), axis=-1)))
tf.print(tf.shape(tf.fill((A,), H)))
prints
[35]
[35]
And this is the same shape!
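As it turns out, the shape is not the problem: tf.fill((A,), H) produces a tensor that does not depend on y_pred at all (argmax is non-differentiable, and H is computed from the external variable only), so there is no path for backpropagation. A minimal sketch with made-up shapes (2 samples, 2 classes) showing that gradient connectivity, not shape, is what matters:

```python
import tensorflow as tf

# Hypothetical small example: 2 samples, 2 classes.
y_true = tf.constant([[0.0, 1.0], [1.0, 0.0]])
y_pred = tf.Variable([[0.2, 0.8], [0.6, 0.4]])

with tf.GradientTape() as tape:
    # Same shape as a per-sample loss, but no op here touches y_pred,
    # so nothing connects the loss to the variable.
    bad_loss = tf.fill((2,), 0.5)
bad_grad = tape.gradient(bad_loss, y_pred)

with tf.GradientTape() as tape:
    # MSE does depend on y_pred, so a gradient exists.
    good_loss = tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)
good_grad = tape.gradient(good_loss, y_pred)

print(bad_grad)           # None -> "No gradients provided for any variable"
print(good_grad is None)  # False
```

Both losses have the same shape; only the second one is differentiable with respect to y_pred.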
Thanks to Dr. Snoopy in the comments:
O = tf.zeros(tf.shape(y_pred))
O = math_ops.multiply(O, y_pred)
O = math_ops.add(O, distributions)
log_p = math_ops.log(O)
H = - math_ops.multiply(O, log_p)
H = K.sum(H, axis=1)
H = K.mean(H)
return tf.fill((A,), H)
solves my problem.
Funny, because it multiplies y_pred by zero and adds the answer back, just to connect y_pred to the output.
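The mechanics of that trick can be checked directly with tf.GradientTape, outside any model (a sketch with assumed 2x2 shapes): multiplying zeros by y_pred puts y_pred on the computation graph, so the gradient comes back as a tensor of zeros instead of None, which is enough to satisfy the optimizer's check.

```python
import tensorflow as tf

y_pred = tf.Variable([[0.2, 0.8], [0.6, 0.4]])
distributions = tf.constant([[0.5, 0.5], [0.5, 0.5]])

with tf.GradientTape() as tape:
    # 0 * y_pred contributes nothing numerically, but it does
    # record y_pred on the tape's computation graph.
    O = tf.zeros(tf.shape(y_pred)) * y_pred + distributions
    log_p = tf.math.log(O)
    H = tf.reduce_mean(tf.reduce_sum(-O * log_p, axis=1))

grad = tape.gradient(H, y_pred)
print(grad)  # a tensor of zeros, not None
```

Note that the gradient along this path is identically zero, so it silences the error rather than making H trainable through y_pred.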
And
O = math_ops.multiply(tf.zeros(tf.shape(y_pred)), y_pred)
O = K.mean(O)
# Some math with scalar-tensor output loss.
return loss + O # Correct here for batch_size = tf.shape(y_pred)[0]
also works.
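The scalar variant works for the same reason: reducing 0 * y_pred to a scalar and adding it to an otherwise unconnected loss gives Keras a (zero) gradient path without changing the loss value. A sketch, again with assumed shapes and a hypothetical constant standing in for the externally computed loss:

```python
import tensorflow as tf

y_pred = tf.Variable([[0.2, 0.8], [0.6, 0.4]])

with tf.GradientTape() as tape:
    # Hypothetical scalar loss computed from external data only.
    external_loss = tf.constant(1.25)
    O = tf.reduce_mean(tf.zeros(tf.shape(y_pred)) * y_pred)
    total = external_loss + O  # value is unchanged: O == 0

grad = tape.gradient(total, y_pred)
print(float(total))  # 1.25
print(grad is None)  # False: zero gradient, but not missing
```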