I am currently looking into CycleGAN, using simontomaskarlsson's GitHub repository as my baseline. My problem arises when training is done and I want to use the saved model to generate new samples: the architecture of the loaded model differs from that of the initialized generator. The direct link to the saveModel function is here.
When I initialize the generator that does the translation from domain A to B, the summary looks like the following (line on GitHub). This is as expected, since my input image is (140,140,1) and I am expecting an output image of shape (140,140,1):
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_5 (InputLayer) (None, 140, 140, 1) 0
__________________________________________________________________________________________________
reflection_padding2d_1 (Reflect (None, 146, 146, 1) 0 input_5[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D) (None, 140, 140, 32) 1600 reflection_padding2d_1[0][0]
__________________________________________________________________________________________________
instance_normalization_5 (Insta (None, 140, 140, 32) 64 conv2d_9[0][0]
__________________________________________________________________________________________________
...
__________________________________________________________________________________________________
activation_12 (Activation) (None, 140, 140, 32) 0 instance_normalization_23[0][0]
__________________________________________________________________________________________________
reflection_padding2d_16 (Reflec (None, 146, 146, 32) 0 activation_12[0][0]
__________________________________________________________________________________________________
conv2d_26 (Conv2D) (None, 140, 140, 1) 1569 reflection_padding2d_16[0][0]
__________________________________________________________________________________________________
activation_13 (Activation) (None, 140, 140, 1) 0 conv2d_26[0][0]
==================================================================================================
Total params: 2,258,177
Trainable params: 2,258,177
Non-trainable params: 0
When the training is done I want to load the saved models to generate new samples (translating from domain A to domain B). In this case it does not matter whether the model is successful at translating the images or not. I load the model with the following code:
# load json and create model
from keras.models import model_from_json, load_model

with open('G_A2B_model.json', 'r') as json_file:
    loaded_model_json = json_file.read()

loaded_model = model_from_json(
    loaded_model_json,
    custom_objects={'ReflectionPadding2D': ReflectionPadding2D,
                    'InstanceNormalization': InstanceNormalization})
Or with the following, which gives the same result:
loaded_model = load_model(
    'G_A2B_model.h5',
    custom_objects={'ReflectionPadding2D': ReflectionPadding2D,
                    'InstanceNormalization': InstanceNormalization})
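Note that model_from_json only rebuilds the architecture, so on that route the weights still have to be loaded separately; load_model restores both architecture and weights from the single .h5 file. A minimal sketch for the JSON route, assuming the weights were saved to a file called G_A2B_model_weights.hdf5 (a hypothetical name; use whatever saveModel actually writes):

# hypothetical weights file name; match it to the file saveModel writes
loaded_model.load_weights('G_A2B_model_weights.hdf5')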
Here ReflectionPadding2D is defined as follows (note that I have a separate file for loading the model than the one I use for training the CycleGAN):
# reflection padding taken from
# https://github.com/fastai/courses/blob/master/deeplearning2/neural-style.ipynb
import tensorflow as tf
from keras.layers import Layer, InputSpec

class ReflectionPadding2D(Layer):
    def __init__(self, padding=(1, 1), **kwargs):
        self.padding = tuple(padding)
        self.input_spec = [InputSpec(ndim=4)]
        super(ReflectionPadding2D, self).__init__(**kwargs)

    def compute_output_shape(self, s):
        return (s[0], s[1] + 2 * self.padding[0], s[2] + 2 * self.padding[1], s[3])

    def call(self, x, mask=None):
        w_pad, h_pad = self.padding
        return tf.pad(x, [[0, 0], [h_pad, h_pad], [w_pad, w_pad], [0, 0]], 'REFLECT')
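As a standalone sanity check of the layer itself (not part of the original code), you can run it on a dummy tensor and confirm that padding=(3,3) turns a 140x140 input into 146x146:

import numpy as np

pad = ReflectionPadding2D(padding=(3, 3))
x = tf.constant(np.zeros((1, 140, 140, 1), dtype=np.float32))
# call() is invoked directly here to bypass the Keras layer bookkeeping
print(pad.call(x).shape)  # expected: (1, 146, 146, 1)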
Now that my model is loaded I want to translate images from domain A to domain B. Here I expected the output shape to be (140,140,1), but surprisingly it is (132,132,1). I checked the architecture summary for G_A2B_model, which clearly shows that the output is of shape (132,132,1):
Model: "G_A2B_model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_5 (InputLayer) (None, 140, 140, 1) 0
__________________________________________________________________________________________________
reflection_padding2d_1 (Reflect (None, 142, 142, 1) 0 input_5[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D) (None, 136, 136, 32) 1600 reflection_padding2d_1[0][0]
__________________________________________________________________________________________________
instance_normalization_5 (Insta (None, 136, 136, 32) 64 conv2d_9[0][0]
__________________________________________________________________________________________________
...
__________________________________________________________________________________________________
instance_normalization_23 (Inst (None, 136, 136, 32) 64 conv2d_transpose_2[0][0]
__________________________________________________________________________________________________
activation_12 (Activation) (None, 136, 136, 32) 0 instance_normalization_23[0][0]
__________________________________________________________________________________________________
reflection_padding2d_16 (Reflec (None, 138, 138, 32) 0 activation_12[0][0]
__________________________________________________________________________________________________
conv2d_26 (Conv2D) (None, 132, 132, 1) 1569 reflection_padding2d_16[0][0]
__________________________________________________________________________________________________
activation_13 (Activation) (None, 132, 132, 1) 0 conv2d_26[0][0]
==================================================================================================
Total params: 2,258,177
Trainable params: 2,258,177
Non-trainable params: 0
What I don't understand is why the output shape is (132,132,1). I can see that the problem arises in ReflectionPadding2D, where the output shape of the initialized generator is (146,146,1) while the output shape of the saved generator is (142,142,1). But I have no idea why this is happening, because in theory they should be the same size.
When you persist your architecture using model.to_json, the method get_config is called so that the layer attributes are saved as well. Since you are using a custom class without that method, the default value for padding is used when you call model_from_json. You can see this in the summaries: the first padding layer was built with padding=(3,3) (140 + 2*3 = 146), but the deserialized model falls back to the default padding=(1,1) (140 + 2*1 = 142). The first and last reflection-padding layers both use (3,3) in the original generator, so each one now contributes 4 fewer pixels per spatial dimension, which is exactly the difference between the expected (140,140,1) output and the (132,132,1) you observe.
Using the following code for ReflectionPadding2D should solve your problem; just run the training step again and reload the model.
class ReflectionPadding2D(Layer):
    def __init__(self, padding=(1, 1), **kwargs):
        self.padding = tuple(padding)
        super(ReflectionPadding2D, self).__init__(**kwargs)

    def compute_output_shape(self, s):
        return (s[0], s[1] + 2 * self.padding[0], s[2] + 2 * self.padding[1], s[3])

    def call(self, x, mask=None):
        w_pad, h_pad = self.padding
        return tf.pad(x, [[0, 0], [h_pad, h_pad], [w_pad, w_pad], [0, 0]], 'REFLECT')

    # This is the relevant method that should be added
    def get_config(self):
        config = {'padding': self.padding}
        base_config = super(ReflectionPadding2D, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
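You can verify that the padding now survives serialization, without retraining the whole CycleGAN, by round-tripping the layer config (a small check, not from the original answer):

layer = ReflectionPadding2D(padding=(3, 3))
config = layer.get_config()                    # now contains 'padding': (3, 3)
rebuilt = ReflectionPadding2D.from_config(config)
print(rebuilt.padding)                         # (3, 3) instead of the default (1, 1)

The tuple(padding) call in __init__ also covers the case where the padding comes back from JSON as a list. Models saved before this change still contain no padding entry in their config, which is why the saving (and hence training) step has to be run again.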