Tags: python, tensorflow, keras, autoencoder, loss-function

Variational Autoencoder with multiple inputs and outputs


I have built an autoencoder in Keras that accepts multiple inputs and the same number of outputs, and I would like to convert it into a variational autoencoder. I am having trouble combining the reconstruction loss (the difference between input and output) with the loss of the variational part.

What I want to achieve:

The autoencoder should be used for a data set containing both numeric and categorical data. To do so, I normalize the numerical columns and 1-hot-encode the categorical columns. Since the resulting categorical vectors and the numerical vectors require different loss functions (mean-squared-error for the numerical columns, categorical cross-entropy for the categorical ones), and since the very wide 1-hot vectors would dominate the loss compared to the small numerical columns, I have decided to feed each column as its own input vector. Thus my autoencoder accepts a set of input vectors and generates output vectors of the same number and shapes.
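For illustration, the preprocessing could look roughly like this (a pandas-based sketch; the column names age, income, color and city are hypothetical):

import pandas as pd

df = pd.read_csv("data.csv")      # hypothetical data set
numCols = ["age", "income"]       # hypothetical numeric columns
catCols = ["color", "city"]       # hypothetical categorical columns

# normalize each numeric column -> one array of shape (n, 1) per column
numArrays = [((df[c] - df[c].mean()) / df[c].std()).values.reshape(-1, 1)
             for c in numCols]

# 1-hot-encode each categorical column -> one array of shape (n, n_categories) per column
catArrays = [pd.get_dummies(df[c]).values.astype("float32") for c in catCols]

# one input (and one reconstruction target) array per column
inputArrays = numArrays + catArrays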

What I have done so far:

This is a setup for two numeric inputs and two categorical ones with 20- and 30-wide 1-hot encodings:

from keras.layers import Input, Dense, Concatenate, Lambda
from keras.models import Model

encWidth = 3

## Encoder
layInputs = [Input(shape=(1,)), Input(shape=(1,)),
             Input(shape=(20,)), Input(shape=(30,))]        #<- configurable
x = Concatenate(axis=1)(layInputs)
x = Dense(32, activation="relu")(x)
layEncOut = Dense(encWidth, activation="linear")(x)

## Decoder
layDecIn = Input(shape=(encWidth,), name="In_Encoder")
x = Dense(32, activation="relu")(layDecIn)
# one output layer per input: linear for the numeric, softmax for the 1-hot columns
layOutputs = [Dense(1, activation="linear"), Dense(1, activation="linear"),
              Dense(20, activation="softmax"), Dense(30, activation="softmax")]  #<- configurable
layDecOut = [outLayer(x) for outLayer in layOutputs]

encoder = Model(layInputs, layEncOut, name="encoder")
decoder = Model(layDecIn, layDecOut, name="decoder")

AE = Model(layInputs, decoder(encoder(layInputs)), name="autoencoder")
AE.compile(optimizer="adam",
           loss=['mean_squared_error', 'mean_squared_error',
                 'categorical_crossentropy', 'categorical_crossentropy'],  #<- configurable
           loss_weights=[1.0, 1.0, 1.0, 1.0])                              #<- configurable
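Training then feeds (and reconstructs) one array per column, e.g. (assuming the per-column arrays inputArrays from the preprocessing sketch above):

AE.fit(x=inputArrays, y=inputArrays, epochs=50, batch_size=64)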

This example is static, but in my implementation the numerical and categorical fields are configurable, so the inputs, the kinds of loss functions, and the loss weights are driven by an object that stores the original columns of the data set:

....
## Encoder
x = Concatenate(axis=1)(C.layInputs)
...
AE.compile(optimizer="adam",
           loss=C.losses,
           loss_weights=C.lossWeights)

Here C is an instance of a class that holds the input and output layers and the loss functions/weights, depending on which columns I want to include in the autoencoder.
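A minimal sketch of such a class (the name ColumnConfig and its constructor arguments are made up; only the attributes used above matter):

class ColumnConfig:
    """Builds per-column layers, losses and weights from column metadata (sketch)."""
    def __init__(self, numCols, catCardinalities):
        # one 1-wide input per numeric column, one n-wide input per categorical column
        self.layInputs = ([Input(shape=(1,)) for _ in numCols] +
                          [Input(shape=(n,)) for n in catCardinalities])
        # matching output layers: linear for numeric, softmax for 1-hot columns
        self.layOutputs = ([Dense(1, activation="linear") for _ in numCols] +
                           [Dense(n, activation="softmax") for n in catCardinalities])
        self.losses = (["mean_squared_error"] * len(numCols) +
                       ["categorical_crossentropy"] * len(catCardinalities))
        self.lossWeights = [1.0] * (len(numCols) + len(catCardinalities))

C = ColumnConfig(numCols=["age", "income"], catCardinalities=[20, 30])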

My problem:

I have now extended the setup to a variational autoencoder, with a latent layer parameterized by a mean and a log variance.
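(The sampling function used below is not shown here; it is assumed to be the usual reparameterization trick, roughly:)

import keras.backend as K

def sampling(args):
    """Reparameterization trick: z = mean + sigma * epsilon."""
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=K.shape(z_mean))
    return z_mean + K.exp(0.5 * z_log_var) * epsilon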

encWidth = 2

## Encoder
x = Concatenate(axis=1)(C.layInputs)
x = Dense(32, activation="relu")(x)

### variational part (lrelu is a leaky-ReLU activation defined elsewhere)
z_mean = Dense(encWidth, name='z_mean', activation=lrelu)(x)
z_log_var = Dense(encWidth, name='z_log_var', activation=lrelu)(x)
z = Lambda(sampling, name='z')([z_mean, z_log_var])

## Decoder
layDecodeInput = Input(shape=(encWidth,), name="In_Encoder")
x = Dense(32, activation="relu")(layDecodeInput)
layOutDecoder = [outLayer(x) for outLayer in C.layOutputs]

### build the encoder model
vEncoder = Model(C.layInputs, [z_mean, z_log_var, z], name='v_encoder')

### build the decoder model
vDecoder = Model(layDecodeInput, layOutDecoder, name="v_decoder")

## Autoencoder
vAE = Model(C.layInputs, vDecoder(vEncoder(C.layInputs)[2]))
vae_loss = variational_loss(z_mean, z_log_var)
vAE.compile(optimizer="adam",
            loss=vae_loss)

Now I need a custom loss function that combines the reconstruction loss (the difference between input and output, as in the previous example) with the loss of the variational part; this is what I have come up with so far:

def variational_loss(z_mean, z_log_var, varLossWeight=1.):
    def lossFct(yTrue, yPred):

        # KL divergence of the variational part
        var_loss = -0.5 * K.mean(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var))

        # look up the configured loss function for each output
        # (`losses` is the keras.losses module)
        lossFunctions = [getattr(losses, lossName) for lossName in C.losses]

        # reconstruction loss: compare each input with its corresponding output
        ac_loss = [
            lossFkt(yt, yp) * lossWeight for
            yt, yp, lossFkt, lossWeight in zip(yTrue, yPred, lossFunctions, C.lossWeights)]

        loss = K.mean(ac_loss + [var_loss * varLossWeight])
        return loss
    return lossFct

So it is a factory function that returns a loss function accepting yTrue and yPred, while also having access to the variational part through the closure. The for loop is supposed to iterate over all inputs and their corresponding outputs and compare each pair using the appropriate loss function (mean-squared-error for numerical features, categorical cross-entropy for categorical ones).

But apparently looping over the set of input tensors and comparing them with the set of output tensors is not permitted; I get the error

Tensor objects are only iterable when eager execution is enabled. To iterate over this tensor use tf.map_fn.

How can I get the convenient behavior of Model.compile(), where I can simply specify different loss functions for the different inputs and outputs, combined with the variational loss?


Solution

  • I think it will be simpler to add a KL divergence layer to the network that takes care of the VAE loss. You can do it like this (where beta is the weight of the VAE loss):

    import keras.backend as K
    from keras.layers import Layer
    
    class KLDivergenceLayer(Layer):
    
        """ Identity transform layer that adds KL divergence
        to the final model loss.
        """
    
        def __init__(self, beta=.5, *args, **kwargs):
            self.is_placeholder = True
            self.beta = beta
            super(KLDivergenceLayer, self).__init__(*args, **kwargs)
    
        def call(self, inputs):
    
            mu, log_var = inputs
    
            kl_batch = - self.beta * K.sum(1 + log_var -
                                    K.square(mu) -
                                    K.exp(log_var), axis=-1)
    
            self.add_loss(K.mean(kl_batch), inputs=inputs)
    
            return inputs
    

    Then you can add this line in your code, after you calculate the mean and log var:

    z_mean, z_log_var = KLDivergenceLayer()([z_mean, z_log_var])
    

    This layer is an identity transform that adds the KL divergence to the model's total loss via add_loss(). Your final loss can then just be the per-output reconstruction losses you were already using above, passed to compile() as before; see the sketch below.
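    For instance, wired into the setup from the question it could look roughly like this (a sketch reusing the names defined there):

    ### variational part, with the KL loss now handled by the layer
    z_mean = Dense(encWidth, name='z_mean', activation=lrelu)(x)
    z_log_var = Dense(encWidth, name='z_log_var', activation=lrelu)(x)
    z_mean, z_log_var = KLDivergenceLayer(beta=0.5)([z_mean, z_log_var])
    z = Lambda(sampling, name='z')([z_mean, z_log_var])

    ### build vEncoder, vDecoder and vAE exactly as before, then compile with
    ### the configurable per-output losses only; the KL term is added
    ### automatically through add_loss()
    vAE.compile(optimizer="adam",
                loss=C.losses,
                loss_weights=C.lossWeights)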

    I found this in this post by Louis Tiao: https://tiao.io/post/tutorial-on-variational-autoencoders-with-a-concise-keras-implementation/