Tags: python-3.x, machine-learning, artificial-intelligence, chainer, dcgan

Modifying size of input images for Chainer DCGAN model


I'm using the Chainer DCGAN example found at https://github.com/chainer/chainer/blob/master/examples/dcgan/train_dcgan.py. It works fine for 32x32 images, but for other resolutions the README.md instructs you to modify the network architecture in net.py.

As I understand it from reading the documentation, the size of the training images is passed to the Generator class constructor through the bottom_width and ch parameters. Here is the code for the 32x32 case.

class Generator(chainer.Chain):

    def __init__(self, n_hidden, bottom_width=4, ch=512, wscale=0.02):

I'm confused about how this translates to 32x32, and how to modify it for other resolutions. Any help would be greatly appreciated.


Solution

  • You can work this out from the behavior of Deconvolution2D. In net.py, three Deconvolution2D layers (self.dc1, self.dc2, self.dc3) are defined with stride=2 (the fourth argument of L.Deconvolution2D), each of which doubles the input height/width.

    As a result, the output size is bottom_width * 2^3, which gives 32 when bottom_width=4 (see the first sketch below).

    So, for example, if you want 64x64 images you can set bottom_width=8 for both the generator and the discriminator (but you then need 64x64 images as real data, instead of cifar-100, whose images are 32x32); see the second sketch below.

    Please refer to the official documentation for the details of the input-output size relationship.
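
    Here is a minimal sketch that checks the doubling numerically. It assumes Chainer and NumPy are installed, and that the dc1-dc3 layers in net.py use ksize=4, stride=2, pad=1 (the channel counts below are illustrative):

        import numpy as np
        import chainer.links as L

        # Deconvolution output size: stride * (in - 1) + ksize - 2 * pad.
        # With ksize=4, stride=2, pad=1 this simplifies to 2 * in, i.e. the size doubles.
        deconv = L.Deconvolution2D(512, 256, ksize=4, stride=2, pad=1)

        x = np.zeros((1, 512, 4, 4), dtype=np.float32)  # 4x4 feature maps, i.e. bottom_width=4
        y = deconv(x)
        print(y.shape)  # (1, 256, 8, 8) -- height/width doubled

        # Applied three times: 4 -> 8 -> 16 -> 32, so bottom_width=4 yields 32x32 images.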
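
    And a sketch of how the models might be built for 64x64 images, assuming net.py's Generator and Discriminator both accept a bottom_width argument as in the snippet quoted in the question (n_hidden=100 is just a placeholder value):

        from net import Generator, Discriminator

        # 8 * 2**3 = 64: three doubling deconvolutions turn an 8x8 map into 64x64.
        gen = Generator(n_hidden=100, bottom_width=8, ch=512)
        # The discriminator must mirror the change so its final layer sees 8x8 feature maps.
        dis = Discriminator(bottom_width=8, ch=512)

    Remember that the training data fed to these models must then consist of 64x64 images.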