python, tensorflow, keras, neural-network, loss-function

Custom loss function with weights in Keras


I'm new to neural networks. I wanted to make a custom loss function in TensorFlow, but I need to pass it a vector of weights, so I did it this way:

def my_loss(weights):
  def custom_loss(y, y_pred):
    return weights*(y - y_pred)
  return custom_loss
model.compile(optimizer='adam', loss=my_loss(weights), metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=None,  validation_data=(x_test, y_test), epochs=100)

When I run it, I get this error:

InvalidArgumentError:  Incompatible shapes: [50000,10] vs. [32,10]

The shapes are:

print(weights.shape)
print(y_train.shape)
(50000, 10)
(50000, 10)
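
To make the mismatch concrete: when batch_size is left as None, Keras falls back to a batch size of 32, so inside the loss y and y_pred only cover one batch while weights covers the whole training set. A rough NumPy sketch of the same shapes (just to illustrate, not actual training code):

import numpy as np

weights = np.random.uniform(0, 1, (50000, 10))    # weights for the whole training set
y_batch = np.random.uniform(0, 1, (32, 10))       # y for a single batch of 32
y_pred_batch = np.random.uniform(0, 1, (32, 10))  # predictions for that batch

# this is what my_loss effectively tries to compute, and it cannot broadcast:
weights * (y_batch - y_pred_batch)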

So I thought it was a problem with the batches. I don't have a strong background in TensorFlow, so I tried to solve it in a naive way using a global variable

batch_index = 0

and then updating it in a custom callback via the "on_batch_begin" hook. But it didn't work, and it was a horrible solution. So, how can I get the part of the weights that corresponds to the current y? Is there a way to get the current batch index inside the custom loss? Thank you in advance for your help.


Solution

  • This is a workaround to pass additional arguments to a custom loss function, in your case an array of weights. The trick consists in using fake inputs, which make it possible to build and apply the loss correctly. Don't forget that Keras handles a fixed batch dimension, so the loss only ever sees one batch of targets and predictions at a time.

    I provide a dummy example on a regression problem:

    import numpy as np
    from tensorflow.keras.layers import Input, Dense
    from tensorflow.keras.models import Model
    import tensorflow.keras.backend as K
    
    # custom loss that receives the weights as an extra tensor argument
    def mse(y_true, y_pred, weights):
        error = y_true - y_pred
        return K.mean(K.square(error) + K.sqrt(weights))
    
    # dummy data: features, targets and per-sample weights
    X = np.random.uniform(0,1, (1000,10))
    y = np.random.uniform(0,1, 1000)
    w = np.random.uniform(0,1, 1000)
    
    inp = Input((10,))
    true = Input((1,))     # fake input that carries the targets
    weights = Input((1,))  # fake input that carries the weights
    x = Dense(32, activation='relu')(inp)
    out = Dense(1)(x)
    
    # training model: the fake inputs are wired into the loss via add_loss,
    # so compile gets loss=None and fit gets y=None
    m = Model([inp, true, weights], out)
    m.add_loss( mse( true, out, weights ) )
    m.compile(loss=None, optimizer='adam')
    m.fit(x=[X, y, w], y=None, epochs=3)
    
    ## final fitted model to compute predictions (the fake inputs are dropped)
    final_m = Model(inp, out)
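    
    Since final_m shares its layers (and therefore the trained weights) with m, it can be used directly for inference without the two fake inputs. A quick usage sketch on the dummy data above:
    
    preds = final_m.predict(X)   # predictions with shape (1000, 1)
    
    For your case the same pattern should carry over: make true and weights Input((10,)) layers to match your (50000, 10) arrays, keep your own network between inp and out, and pass [x_train, y_train, weights] as the input list to fit.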