python, tensorflow, keras

How to optimize multiple loss functions separately in Keras?


I am trying to build a deep learning model with three different loss functions in Keras. The first is the usual mean squared error. The other two are ones I wrote myself; each computes the difference between a quantity calculated from the input image and the same quantity calculated from the output image (this code is a simplified version of what I'm doing).

from keras import backend as K

def p_autoencoder_loss(yTrue, yPred):

    def loss(yTrue, yPred):
        # Per-sample mean squared error
        return K.mean(K.square(yTrue - yPred), axis=-1)

    def a(image):
        return K.mean(K.sin(image))

    def b(image):
        return K.sqrt(K.cos(image))

    a_pred = a(yPred)
    a_true = a(yTrue)

    b_pred = b(yPred)
    b_true = b(yTrue)

    empirical_loss = loss(yTrue, yPred)
    a_loss = K.mean(K.square(a_true - a_pred))
    b_loss = K.mean(K.square(b_true - b_pred))
    # All three terms are summed into a single scalar loss
    final_loss = K.mean(empirical_loss + a_loss + b_loss)
    return final_loss

However, when I train with this loss function, it does not converge well. What I want to try is minimizing the three loss functions separately, rather than adding them together into a single loss.

I essentially want to do the second option described in "Tensorflow: Multiple loss functions vs Multiple training ops", but in Keras. I also want the loss functions to be independent of each other. Is there a simple way to do this?


Solution

  • You could give your Keras model three outputs, each with its own loss; Keras supports weighting these losses. It will also report a final combined loss in the training output, and it optimises to reduce the weighted sum of all three. Be wary, though: depending on your data, problem, and losses, training may stall or slow down if the losses fight each other. This approach requires the functional API. I'm unsure whether it actually uses separate optimiser instances, but I think it is as close as you will get in pure Keras, as far as I'm aware, without writing more complex TF training regimes.

    For example:

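    from tensorflow import keras
    from tensorflow.keras import layers

    # `inputs` (a keras.Input) and `x` (the last shared layer) are defined above
    loss_out1 = layers.Dense(1, activation='sigmoid', name='loss1')(x)
    loss_out2 = layers.Dense(1, activation='sigmoid', name='loss2')(x)
    loss_out3 = layers.Dense(1, activation='sigmoid', name='loss3')(x)

    model = keras.Model(inputs=[inputs],
                        outputs=[loss_out1, loss_out2, loss_out3])
    # custom_loss1 is your own loss function, passed as a callable
    model.compile(optimizer=keras.optimizers.RMSprop(1e-3),
                  loss=['binary_crossentropy', 'categorical_crossentropy', custom_loss1],
                  loss_weights=[1., 1., 1.])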
    

    This should compile a model with three outputs branching from x, which would be defined above. You pass the outputs as a list when building the model, and the losses and loss weights as lists when compiling. Note that when you call fit() you'll need to supply your targets three times as a list too, e.g. [y, y, y], since your model now has three outputs.

    I'm not a Keras expert, but it's pretty high-level and I'm not aware of another way using pure Keras. Hopefully someone can correct me with a better solution!
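
For the "more complex TF training regime" mentioned above, one possibility in TensorFlow 2 is a custom training loop that takes a separate gradient step for each loss, each with its own optimiser. The following is only a minimal sketch under assumptions: it assumes TF 2.x with eager execution, and the toy model, the random data, and the names mse_loss, a_loss, b_loss and train_step are made up for illustration. The b_loss here uses cos() instead of the question's sqrt(cos()) to avoid square roots of negative values in a toy setting.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Toy autoencoder; architecture and input size are placeholders (assumptions)
model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(4, activation="relu"),
    keras.layers.Dense(8),
])

def mse_loss(y_true, y_pred):
    # Plain mean squared error over all elements
    return tf.reduce_mean(tf.square(y_true - y_pred))

def a_loss(y_true, y_pred):
    # Squared difference between the mean of sin() over each tensor
    return tf.square(tf.reduce_mean(tf.sin(y_true)) - tf.reduce_mean(tf.sin(y_pred)))

def b_loss(y_true, y_pred):
    # Illustrative stand-in: cos() instead of the question's sqrt(cos())
    return tf.reduce_mean(tf.square(tf.cos(y_true) - tf.cos(y_pred)))

loss_fns = [mse_loss, a_loss, b_loss]
# One optimiser per loss, so each keeps its own learning rate and moment state
optimizers = [keras.optimizers.Adam(1e-3) for _ in loss_fns]

def train_step(x):
    """Apply one gradient update per loss, one after another."""
    values = []
    for loss_fn, opt in zip(loss_fns, optimizers):
        with tf.GradientTape() as tape:
            loss = loss_fn(x, model(x, training=True))
        grads = tape.gradient(loss, model.trainable_variables)
        opt.apply_gradients(zip(grads, model.trainable_variables))
        values.append(float(loss))
    return values

# Random data standing in for real images (autoencoder: target == input)
x_train = np.random.rand(64, 8).astype("float32")
for epoch in range(3):
    losses = train_step(x_train)
    print(epoch, losses)
```

Note that all three updates still act on the same shared weights, so the losses are minimised in separate steps but are not fully independent of each other; whether this converges better than the summed loss depends on the problem.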