I am attempting to train a GAN to learn the distribution of a number of features in an event. Both the trained Discriminator and Generator have low losses, but the generated events have differently shaped distributions from the real ones, and I am unsure why.
I define the GAN as follows:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LeakyReLU

def create_generator():
    generator = Sequential()
    generator.add(Dense(50, input_dim=noise_dim))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(25))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(5))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(len(variables), activation='tanh'))
    return generator
def create_discriminator():
    discriminator = Sequential()
    discriminator.add(Dense(4, input_dim=len(variables)))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(4))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(4))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(1, activation='sigmoid'))
    discriminator.compile(loss='binary_crossentropy', optimizer=optimizer)
    return discriminator

discriminator = create_discriminator()
generator = create_generator()
def define_gan(generator, discriminator):
    # make weights in the discriminator not trainable
    discriminator.trainable = False
    model = Sequential()
    model.add(generator)
    model.add(discriminator)
    model.compile(loss='binary_crossentropy', optimizer=optimizer)
    return model
gan = define_gan(generator, discriminator)
And I train the GAN using this loop:
for epoch in range(epochs):
    for batch in range(steps_per_epoch):
        # Generate a batch of fake events from random noise
        noise = np.random.normal(0, 1, size=(batch_size, noise_dim))
        fake_x = generator.predict(noise)

        # Sample a batch of real events and combine with the fakes
        real_x = x_train[np.random.randint(0, x_train.shape[0], size=batch_size)]
        x = np.concatenate((real_x, fake_x))

        # Real events have label 1, fake events have label 0
        disc_y = np.zeros(2 * batch_size)
        disc_y[:batch_size] = 1

        # Train the discriminator on the combined batch
        discriminator.trainable = True
        d_loss = discriminator.train_on_batch(x, disc_y)
        discriminator.trainable = False

        # Train the generator (through the combined GAN) to label fakes as real
        y_gen = np.ones(batch_size)
        g_loss = gan.train_on_batch(noise, y_gen)
My real events are scaled using the sklearn standard scaler:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
Generating events:
X_noise = np.random.normal(0, 1, size=(n_events, GAN_noise_size))
X_generated = generator.predict(X_noise)
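The "unscaling" mentioned below is just the inverse of the StandardScaler transform; a minimal sketch, assuming scaler is the same fitted instance from above:
# Map generated events back to the original feature units
X_generated_unscaled = scaler.inverse_transform(X_generated)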
When I then use the trained generator, after a few hundred to a few thousand epochs of training, to generate new events and unscale them, I get distributions that look like this:
And plotting two of the features against each other for the real and fake events gives:
This looks similar to mode collapse, but I don't see how that could lead to these extreme values, with everything cut off beyond those points.
Mode collapse happens when the generator finds a few values, or a small range of values, that do the best job of fooling the discriminator. Since your range of generated values is fairly narrow, I believe you are experiencing mode collapse.

You can train for different durations and plot the results to see when the collapse occurs. Sometimes, if you train long enough, it will fix itself and start learning again. There are a billion recommendations on how to train GANs; I collected a bunch and then brute-forced my way through them for each GAN I built.

You could try training the discriminator only every other cycle, to give the generator a chance to learn. Several people also recommend not training the discriminator on real and fake data in the same batch (I haven't done this myself, so I can't say what impact it has, if any). You might also want to try adding some batch normalization layers; a rough sketch of these tweaks is below.

Jason Brownlee has a bunch of good articles on training GANs, so you may want to start there.
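As a rough illustration only, not a drop-in fix (the layer sizes, optimizer, and globals like noise_dim and batch_size are just carried over from your code), the three tweaks might look something like this:
from keras.models import Sequential
from keras.layers import Dense, LeakyReLU, BatchNormalization
import numpy as np

def create_generator():
    # Same shape as your generator, with batch norm between the hidden layers
    generator = Sequential()
    generator.add(Dense(50, input_dim=noise_dim))
    generator.add(LeakyReLU(0.2))
    generator.add(BatchNormalization())
    generator.add(Dense(25))
    generator.add(LeakyReLU(0.2))
    generator.add(BatchNormalization())
    generator.add(Dense(len(variables), activation='tanh'))
    return generator

for epoch in range(epochs):
    for batch in range(steps_per_epoch):
        noise = np.random.normal(0, 1, size=(batch_size, noise_dim))
        fake_x = generator.predict(noise)
        real_x = x_train[np.random.randint(0, x_train.shape[0], size=batch_size)]

        # Only update the discriminator every other step,
        # and feed it real and fake examples in separate batches
        if batch % 2 == 0:
            discriminator.trainable = True
            d_loss_real = discriminator.train_on_batch(real_x, np.ones(batch_size))
            d_loss_fake = discriminator.train_on_batch(fake_x, np.zeros(batch_size))
            discriminator.trainable = False

        # The generator is still updated every step, as in your loop
        g_loss = gan.train_on_batch(noise, np.ones(batch_size))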