Tags: machine-learning, conv-neural-network, deconvolution, image-generation

Generative adversarial network generating images with some random pixels


I am trying to generate images using Generative Adversarial Networks (GANs) on the CelebA aligned dataset, with each image resized to 64*64 and stored in .jpeg format. My network definition is as follows:

import lasagne

def my_discriminator(input_var=None):
    net = lasagne.layers.InputLayer(shape=(None, 3, 64, 64), input_var=input_var)
    net = lasagne.layers.Conv2DLayer(net, 64, filter_size=(6, 6), stride=2, pad=2, W=lasagne.init.HeUniform(), nonlinearity=lasagne.nonlinearities.LeakyRectify(0.2))  # 64*32*32
    net = lasagne.layers.Conv2DLayer(net, 128, filter_size=(6, 6), stride=2, pad=2, W=lasagne.init.HeUniform(), nonlinearity=lasagne.nonlinearities.LeakyRectify(0.2))  # 128*16*16
    net = lasagne.layers.Conv2DLayer(net, 256, filter_size=(6, 6), stride=2, pad=2, W=lasagne.init.HeUniform(), nonlinearity=lasagne.nonlinearities.LeakyRectify(0.2))  # 256*8*8
    net = lasagne.layers.Conv2DLayer(net, 512, filter_size=(6, 6), stride=2, pad=2, W=lasagne.init.HeUniform(), nonlinearity=lasagne.nonlinearities.LeakyRectify(0.2))  # 512*4*4
    net = lasagne.layers.DenseLayer(net, 2048, W=lasagne.init.HeUniform(), nonlinearity=lasagne.nonlinearities.LeakyRectify(0.2))
    net = lasagne.layers.DenseLayer(net, 1, nonlinearity=lasagne.nonlinearities.sigmoid)
    return net

def my_generator(input_var=None):
    gen_net = lasagne.layers.InputLayer(shape=(None, 100), input_var=input_var)
    gen_net = lasagne.layers.DenseLayer(gen_net, 2048, W=lasagne.init.HeUniform())
    gen_net = lasagne.layers.DenseLayer(gen_net, 512 * 4 * 4, W=lasagne.init.HeUniform())
    gen_net = lasagne.layers.ReshapeLayer(gen_net, shape=([0], 512, 4, 4))
    gen_net = lasagne.layers.Deconv2DLayer(gen_net, 256, filter_size=(6, 6), stride=2, crop=2, W=lasagne.init.HeUniform(), nonlinearity=lasagne.nonlinearities.rectify)  # 256*8*8
    gen_net = lasagne.layers.Deconv2DLayer(gen_net, 128, filter_size=(6, 6), stride=2, crop=2, W=lasagne.init.HeUniform(), nonlinearity=lasagne.nonlinearities.rectify)  # 128*16*16
    gen_net = lasagne.layers.Deconv2DLayer(gen_net, 64, filter_size=(6, 6), stride=2, crop=2, W=lasagne.init.HeUniform(), nonlinearity=lasagne.nonlinearities.rectify)  # 64*32*32
    gen_net = lasagne.layers.Deconv2DLayer(gen_net, 3, filter_size=(6, 6), stride=2, crop=2, nonlinearity=lasagne.nonlinearities.tanh)  # 3*64*64
    return gen_net
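As a sanity check, the two networks can be instantiated and their output shapes verified roughly like this (a minimal sketch; the symbolic variable names are only placeholders, not part of my training code):

import theano.tensor as T

noise_var = T.matrix('noise')     # batch of 100-dim noise vectors
input_var = T.tensor4('inputs')   # batch of 3x64x64 images

generator = my_generator(noise_var)
discriminator = my_discriminator(input_var)

# The printed shapes should match the per-layer comments above
print(lasagne.layers.get_output_shape(generator))      # (None, 3, 64, 64)
print(lasagne.layers.get_output_shape(discriminator))  # (None, 1)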

The images produced by the generator contain some randomly colored pixels and also a "grid"-like structure, as can be seen in the example image.

My question is: what are the reasons for these two problems? I used almost the same architecture, with one fewer convolution layer in the generator and discriminator, on the CIFAR-10 dataset with 32*32 images in .png format, and there the generated images did not show these artifacts. I am not sure whether the image format could be the reason. I would be very thankful if someone could provide some ideas, approaches, or links to get rid of these issues.


Solution

  • The reasons for these problems were:

    1. Random pixels - the normalization of the image data must match the activation function of the last layer of the generator: scale pixel values to [-1, 1] to match tanh (see the normalization sketch after this list).

    2. "Grid" in generated images - caused by the way each image's dimensions were changed: use 'transpose' instead of 'reshape' to convert (64, 64, 3) -> (3, 64, 64) (see the transpose example after this list).