Tags: python, function, autoencoder, activation, softmax

Conditional Variational Autoencoder for cocktail recipe generation


I want to use a conditional variational autoencoder to generate cocktail recipes. I modified the code from this repo so it can read my own data. The input is an array over all possible ingredients, so most entries are 0. If an ingredient is present, it gets a value equal to its amount normalized by 250 ml. The last index is what is 'left over', to make sure a cocktail always adds up to 1.

Example:

0,0.0,0.0,0.24,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.6,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.088,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0120000000000000
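
For reference, I build each input row roughly like this (simplified sketch; the ingredient names, the index mapping and the vector length are just made-up placeholders):

    import numpy as np

    N_SLOTS = 75                                   # one slot per possible ingredient + the 'left over' slot (use your real count)
    ingredient_index = {'gin': 3, 'vermouth': 31}  # made-up name -> index mapping

    def encode_recipe(amounts_ml):
        """amounts_ml: dict mapping ingredient name -> amount in ml."""
        row = np.zeros(N_SLOTS)
        for name, ml in amounts_ml.items():
            row[ingredient_index[name]] = ml / 250.0   # normalize by 250 ml
        row[-1] = 1.0 - row[:-1].sum()                 # 'left over', so the row sums to 1
        return row

    print(encode_recipe({'gin': 60, 'vermouth': 150}))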

The output with a softmax activation function looks a bit like this:

[5.8228267e-10 6.7397465e-10 1.9761790e-08 2.3713847e-01 3.1315527e-11
 4.9592632e-11 4.2637563e-05 7.6098106e-10 2.9357905e-05 1.3291576e-08
 2.6885323e-09 4.2986945e-10 3.0274603e-09 8.6994453e-11 3.2391853e-10
 3.3694150e-10 4.9642315e-11 2.2861177e-10 2.5966980e-11 3.3872125e-10
 4.8175470e-12 1.1207919e-09 7.8108942e-10 1.0438563e-09 4.7190268e-12
 2.2692757e-09 3.3177341e-10 4.7493649e-09 1.6603904e-08 2.7854623e-11
 1.1586791e-07 2.3917833e-08 1.0172608e-09 2.2049740e-06 4.0200213e-10
 4.8334226e-05 1.9393491e-09 4.0731374e-10 4.5671125e-10 8.5878060e-10
 1.3625046e-10 1.7755342e-09 2.4927729e-09 3.8919952e-09 1.6791472e-10
 1.5160178e-09 9.0631114e-10 1.2043951e-08 2.1420650e-01 1.4531254e-10
 3.9913628e-10 4.6368896e-06 6.8399265e-11 2.4654754e-09 6.5392605e-12
 5.8443012e-10 2.7861690e-11 4.7215394e-08 5.1503157e-09 5.4484850e-10
 1.9266211e-10 7.2835156e-09 6.4243433e-10 4.2432866e-09 4.2630177e-08
 1.1281617e-12 1.8015703e-08 3.5657147e-10 3.4241193e-11 4.8394988e-10
 9.6064046e-11 2.9857121e-02 3.8048144e-11 1.1893182e-10 5.1867032e-01]

How can I make sure that the values are distributed among only a couple of ingredients and the rest of the ingredients get 0, similar to the input? Is this a matter of changing the activation function?

Thanks :)


Solution

  • I'm not sure you want to use probabilities here. It seems you're doing a regression to some specific values, so it would make sense to drop the softmax and use a simple mean-squared-error loss (a minimal sketch of this follows after this list).

    Note that if certain values are consistently over- or under-weighted in your loss, you can add an extra weight to the loss yourself, or use an existing abstraction such as Keras's class_weight (a small weighted-loss sketch also follows below).

    You could consider using Keras for this; there is a variational autoencoder example checked into master, which the first sketch below is modelled on: https://github.com/keras-team/keras/blob/master/examples/variational_autoencoder.py

    It might actually make sense to use a GAN for this task: https://github.com/keras-team/keras/blob/master/examples/mnist_acgan.py . You let it distinguish between a random cocktail and a 'real' cocktail; in learning to tell the two apart, it also trains a generator that can create new cocktails for you (a rough sketch follows below).
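
For the mean-squared-error point, here is a minimal sketch modelled on the linked variational_autoencoder.py example, with the decoder ending in a sigmoid (so each predicted amount stays in [0, 1]) and an MSE reconstruction term instead of a softmax. The input dimension of 75 and the layer sizes are assumptions, not values taken from your data:

    from keras.layers import Input, Dense, Lambda
    from keras.models import Model
    from keras.losses import mse
    from keras import backend as K

    original_dim = 75   # number of ingredient slots (assumption)
    latent_dim = 8

    # Encoder
    inputs = Input(shape=(original_dim,))
    h = Dense(64, activation='relu')(inputs)
    z_mean = Dense(latent_dim)(h)
    z_log_var = Dense(latent_dim)(h)

    def sampling(args):
        """Reparameterization trick: draw z from N(z_mean, exp(z_log_var))."""
        z_mean, z_log_var = args
        epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim))
        return z_mean + K.exp(0.5 * z_log_var) * epsilon

    z = Lambda(sampling)([z_mean, z_log_var])

    # Decoder: sigmoid instead of softmax, so the amounts are independent values in [0, 1]
    h_dec = Dense(64, activation='relu')(z)
    outputs = Dense(original_dim, activation='sigmoid')(h_dec)

    vae = Model(inputs, outputs)

    # MSE reconstruction + KL divergence as the total loss
    reconstruction_loss = mse(inputs, outputs) * original_dim
    kl_loss = -0.5 * K.sum(1 + z_log_var - K.square(z_mean) - K.exp(z_log_var), axis=-1)
    vae.add_loss(K.mean(reconstruction_loss + kl_loss))
    vae.compile(optimizer='adam')

    # vae.fit(recipes, epochs=100, batch_size=32)   # recipes: array of shape (n_samples, 75)

Sigmoid plus MSE treats every ingredient amount as its own regression target, which is the regression framing described in the first point.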
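
For the loss-weighting point, a hand-rolled per-feature weight can be passed as a custom Keras loss; the weight values below (including the down-weighted 'left over' slot at the last index) are made-up placeholders to illustrate the idea:

    import numpy as np
    from keras import backend as K

    # Hypothetical weights: full weight on the ingredient slots, reduced weight
    # on the 'left over' slot at the last index (values would need tuning).
    feature_weights = K.constant(np.concatenate([np.ones(74), [0.1]]))

    def weighted_mse(y_true, y_pred):
        return K.mean(feature_weights * K.square(y_true - y_pred), axis=-1)

    # model.compile(optimizer='adam', loss=weighted_mse)

The same weighting could equally be multiplied into the MSE reconstruction term of a VAE.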
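
For the GAN suggestion, here is a very rough sketch, loosely following the structure of the linked mnist_acgan.py example but without the class conditioning; the layer sizes and noise dimension are assumptions:

    import numpy as np
    from keras.layers import Input, Dense
    from keras.models import Model, Sequential

    original_dim = 75    # number of ingredient slots (assumption)
    noise_dim = 16

    # Generator: noise in, cocktail vector out (amounts in [0, 1])
    generator = Sequential([
        Dense(64, activation='relu', input_shape=(noise_dim,)),
        Dense(original_dim, activation='sigmoid'),
    ])

    # Discriminator: cocktail vector in, "real or generated?" out
    discriminator = Sequential([
        Dense(64, activation='relu', input_shape=(original_dim,)),
        Dense(1, activation='sigmoid'),
    ])
    discriminator.compile(optimizer='adam', loss='binary_crossentropy')

    # Combined model: the discriminator is frozen while the generator trains
    discriminator.trainable = False
    gan_input = Input(shape=(noise_dim,))
    gan = Model(gan_input, discriminator(generator(gan_input)))
    gan.compile(optimizer='adam', loss='binary_crossentropy')

    def train_step(real_batch):
        batch_size = real_batch.shape[0]
        noise = np.random.normal(size=(batch_size, noise_dim))
        fake_batch = generator.predict(noise)
        # The discriminator learns to tell real recipes from generated ones ...
        discriminator.train_on_batch(real_batch, np.ones((batch_size, 1)))
        discriminator.train_on_batch(fake_batch, np.zeros((batch_size, 1)))
        # ... and the generator learns to fool it
        return gan.train_on_batch(noise, np.ones((batch_size, 1)))

After training, generator.predict(np.random.normal(size=(1, noise_dim))) gives you a new cocktail vector.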