Tags: python, tensorflow, keras, autoencoder

Keras custom loss function that uses hidden layer outputs as part of the objective


I am trying to implement an autoencoder in Keras that not only minimizes the reconstruction error but whose learned features also maximize a measure I define. I don't really have an idea of how to do this.

Here's a snippet of what I have so far:

# (assumed imports, not shown in the snippet)
# from keras.layers import Input, Dense
# from keras.models import Model
# from keras import losses

corrupt_data = self._corrupt(self.data, 0.1)

# define encoder-decoder network structure
# create input layer
input_layer = Input(shape=(corrupt_data.shape[1], ))
encoded = Dense(self.encoding_dim, activation = "relu")(input_layer)
decoded = Dense(self.data.shape[1], activation="sigmoid")(encoded)

# create autoencoder
dae = Model(input_layer, decoded)

# define custom multitask loss with wlm measure
def multitask_loss(y_true, y_pred):
    # extract learned features from hidden layer
    learned_fea = Model(input_layer, encoded).predict(self.data)
    # additional measure I want to optimize from an external function
    wlm_measure = wlm.measure(learned_fea, self.labels)
    cross_entropy = losses.binary_crossentropy(y_true, y_pred)
    return wlm_measure + cross_entropy

# create optimizer
dae.compile(optimizer=self.optimizer, loss=multitask_loss)

dae.fit(corrupt_data, self.data,
        epochs=self.epochs, batch_size=20, shuffle=True,
        callbacks=[tensorboard])

# separately create an encoder model
encoder = Model(input_layer, encoded)

Currently this does not work properly... When I view the training history, the model seems to ignore the additional measure and train only on the cross-entropy loss. Also, if I change the loss function to consider only the wlm measure, I get the error '"numpy.float64" object has no attribute "get_shape"' (I don't know if changing my wlm function's return type to a tensor would help).

There are a few places that I think may have gone wrong. I don't know if I am extracting the outputs of the hidden layer correctly in my custom loss function. I also don't know whether my wlm.measure function is returning the right type: should it output a numpy.float32 scalar, or a 1-dimensional tensor of type float32?

Basically a conventional loss function only cares about the output layer's predicted labels and the true labels. In my case, I also need to consider the hidden layer's output (activation), which is not that straightforward to implement in Keras.

Thanks for the help!


Solution

  • You don't want to define your learned_fea Model inside your custom loss function. Instead, define a single model upfront with two outputs: the output of the decoder (the reconstruction) and the output of the encoder (the feature representation):

    multi_output_model = Model(inputs=input_layer, outputs=[decoded, encoded])
    

    Now you can write a custom loss function that only applies to the output of the encoder:

    def custom_loss(y_true, y_pred):
        return wlm.measure(y_pred, y_true)
    
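Note that for this to train, the measure must be built from backend tensor operations so gradients can flow through it; returning a plain numpy.float64 (as in the question) is exactly what produces the "get_shape" error. The real wlm.measure is external and not shown in the question, so as a hedged illustration, here is a hypothetical differentiable stand-in: a center-loss-style measure that pulls each encoded feature vector toward the mean feature vector of its class, computed within the batch.

```python
import tensorflow as tf

# Hypothetical differentiable stand-in for wlm.measure (an assumption for
# illustration, not the asker's actual measure). y_true holds integer class
# labels, y_pred holds the encoder output. Every operation below is a tensor
# op, so Keras can backpropagate through the result.
def wlm_like_loss(y_true, y_pred):
    labels = tf.cast(tf.reshape(y_true, [-1]), tf.int32)
    num_classes = tf.reduce_max(labels) + 1
    # per-class feature sums and counts, computed within the batch
    sums = tf.math.unsorted_segment_sum(y_pred, labels, num_classes)
    counts = tf.math.unsorted_segment_sum(
        tf.ones_like(y_pred), labels, num_classes)
    centers = sums / tf.maximum(counts, 1.0)
    # mean squared distance of each sample to its class center
    return tf.reduce_mean(
        tf.reduce_sum(tf.square(y_pred - tf.gather(centers, labels)), axis=1))
```

Any measure you substitute should have the same shape contract: take the two tensors Keras passes in and return a scalar (or per-sample) tensor, never a numpy value.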

    Upon compiling the model, you pass a list of loss functions (or a dictionary if you name your tensors):

    model.compile(loss=['binary_crossentropy', custom_loss], optimizer=...)
    

    And fit the model by passing a list of outputs:

    model.fit(x=X, y=[data_to_be_reconstructed, labels_for_wlm_measure])
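Putting the pieces together, a minimal end-to-end sketch of the two-output approach might look like the following. The data shapes, encoding dimension, and the stand-in for wlm.measure are all assumptions for illustration; substitute your own measure, written with tensor ops so it stays differentiable.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# toy data standing in for self.data / self.labels (assumed shapes)
data = np.random.rand(100, 32).astype("float32")
labels = np.random.randint(0, 2, size=(100, 1)).astype("float32")
corrupt_data = np.clip(
    data + 0.1 * np.random.randn(100, 32), 0, 1).astype("float32")

encoding_dim = 8
input_layer = Input(shape=(32,))
encoded = Dense(encoding_dim, activation="relu")(input_layer)
decoded = Dense(32, activation="sigmoid")(encoded)

# one model, two outputs: the reconstruction and the feature representation
model = Model(inputs=input_layer, outputs=[decoded, encoded])

# hypothetical differentiable stand-in for wlm.measure: replace this with
# your real measure, built from tensor ops
def custom_loss(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_pred - y_true))

# one loss per output; loss_weights optionally balances the two terms
model.compile(optimizer="adam",
              loss=["binary_crossentropy", custom_loss],
              loss_weights=[1.0, 0.5])

model.fit(corrupt_data, [data, labels],
          epochs=1, batch_size=20, shuffle=True, verbose=0)

# after training, the encoder alone is just a sub-model
encoder = Model(input_layer, encoded)
features = encoder.predict(data, verbose=0)
```

If the training history shows one term dominating, loss_weights is the knob to adjust; it scales each loss before they are summed into the total objective.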