I would like to build a DCGAN for MNIST by myself in TensorFlow. However, I'm struggling to figure out how to set up the loss function for the generator. In a Keras DCGAN implementation the author used a little "workaround" for this problem: he simply built 3 models: the generator (G), the discriminator (D), and a third one that combines G with D, with the trainability of D set to false there.
This way, he can feed D with real and generated images to train D, and train G through the combined G+D model: since D is not trainable in that combined model, D's loss is propagated back to G.
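For reference, a rough sketch of that stacked-model workaround might look like this (all helper names, shapes and hyperparameters here are my own placeholders, not the original author's code):

from keras.models import Model
from keras.layers import Input

G = build_generator()        # hypothetical helper returning the generator model
D = build_discriminator()    # hypothetical helper returning the discriminator model
D.compile(optimizer='adam', loss='binary_crossentropy')

D.trainable = False                      # freeze D inside the combined model
z = Input(shape=(100,))                  # latent noise vector
combined = Model(z, D(G(z)))             # G stacked with the frozen D
combined.compile(optimizer='adam', loss='binary_crossentropy')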
In TensorFlow, I've built G and D already. Training D is relatively simple, since I just need to combine a batch of real MNIST training images with generated ones and call the training op:
session.run(D_train_op,
feed_dict={x: batch_x, y: batch_y})
The training op in this example minimizes a cross-entropy loss:
tf.losses.softmax_cross_entropy(y, D_out)
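For context, here is a minimal sketch of how such a D_train_op could be wired up (the discriminator helper and the learning rate are assumptions on my part, not code from above):

D_out = discriminator(x)                 # discriminator logits for the real/fake classes
D_loss = tf.losses.softmax_cross_entropy(onehot_labels=y, logits=D_out)
D_train_op = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(D_loss)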
...but how would I set up the loss function for G when I do not have a "stacked" model combining G and D into a single, third model?
I know that I have to generate a batch of images with G, feed them into D and then obtain D's loss... however, the output of G has shape (batch_size, 28, 28, 1). How would I set up a loss function for G by hand?
Without the combined "G and D" model workaround, I would have to propagate D's loss, whose output has shape (batch_size, 1), back to the output layer of G.
If G were doing some kind of classification, for example, this wouldn't be that hard to figure out... but G outputs images, so I cannot directly map D's loss to G's output layer.
Do I have to set up a third model combining G+D? Or is there a way to calculate the loss for G by hand?
Any help is highly appreciated :)
In the generator training step, you can think of the network as involving the discriminator too, but for the backpropagation you only update the generator's weights. A good explanation of this can be found here.
As mentioned in the original paper, the discriminator cost is:

$$J^{(D)} = -\mathbb{E}_{x \sim p_\text{data}}\big[\log D(x)\big] - \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]$$

And the (non-saturating) generator cost is:

$$J^{(G)} = -\mathbb{E}_{z \sim p_z}\big[\log D(G(z))\big]$$
Of course, you don't need to compute the gradients by hand; TensorFlow already handles that. To implement the whole process, you can do the following:
G_sample = generator(z)               # G(z): images generated from noise z
D_real = discriminator(X)             # D(x): probability that a real image is real
D_fake = discriminator(G_sample)      # D(G(z)): probability that a generated image is real

D_loss = tf.reduce_mean(-tf.log(D_real) - tf.log(1 - D_fake))
G_loss = tf.reduce_mean(-tf.log(D_fake))
where D_real and D_fake are the discriminator's outputs (the probabilities from its last layer) for real and generated images, and G_sample is the generator's output. Then you can implement the training process in the standard way:
D_solver = (tf.train.AdamOptimizer(learning_rate=0.0001, beta1=0.5)
            .minimize(D_loss, var_list=theta_D))
G_solver = (tf.train.AdamOptimizer(learning_rate=0.0001, beta1=0.5)
            .minimize(G_loss, var_list=theta_G))
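Here theta_D and theta_G are the lists of the discriminator's and generator's trainable variables, so each optimizer only updates its own sub-network. One common way to collect them, assuming the two sub-networks were built inside variable scopes named "discriminator" and "generator" (that scoping is my assumption), is:

theta_D = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='discriminator')
theta_G = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='generator')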
And just run the solvers in a session.
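A bare-bones training loop could then look roughly like this (X and z are assumed to be placeholders, and next_real_batch, sample_z, num_steps, batch_size and z_dim are hypothetical helpers/constants, not part of the code above):

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(num_steps):
        X_batch = next_real_batch(batch_size)    # hypothetical: batch of real MNIST images
        z_batch = sample_z(batch_size, z_dim)    # hypothetical: batch of noise vectors

        # one discriminator update, then one generator update
        _, d_loss_cur = sess.run([D_solver, D_loss], feed_dict={X: X_batch, z: z_batch})
        _, g_loss_cur = sess.run([G_solver, G_loss], feed_dict={z: z_batch})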