I am adapting this implementation of VAE https://github.com/keras-team/keras/blob/master/examples/variational_autoencoder.py that I found here https://blog.keras.io/building-autoencoders-in-keras.html
This implementation does not use convolutional layers so everything happens in 1D so to speak. My goal is to implement 3D convolutional layers within this model.
However I run into a shape mismatch at the loss function when running the batches (which are of 128 samples):
def vae_loss(self, x, x_decoded_mean):
xent_loss = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
#xent_loss.shape >> [128, 40, 20, 40, 1]
kl_loss = - 0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
#kl_loss.shape >> [128]
return K.mean(xent_loss + kl_loss) # >> error shape mismatch
Almost the same question is answered here already Keras - Variational Autoencoder Incompatible shape for a model with 1D convolutional layers, but I can't really understand how to extrapolate the answer to my case wjich has a more complex Input shape.
I have tried this solution:
xent_loss = original_dim * metrics.binary_crossentropy(K.flatten(x), K.flatten(x_decoded_mean))
But I don't know whether it is a valid solution or not from a mathematical point of view, although now the model is running.
Your approach is right but it's highly dependent on K.binary_crossentropy
implementation. tensorflow
and theano
ones should work for you (as far as I know). To make it more clean and not implementation dependent I suggest you the following way:
xent_loss_vec = original_dim * metrics.binary_crossentropy(x, x_decoded_mean)
xent_loss = K.mean(xent_loss_vec, axis=[1, 2, 3, 4])
# xent_loss.shape = (128,)
Now you are taking a mean out of losses for each voxel and thanks to that every valid implementation of binary_crossentropy
should work fine for you.