Image dimension mismatch while trying to add Noise to image using Keras Sequential

To reproduce this on your system, please find the source code and dataset here

What am I trying to do?
I am trying to create a simple GAN (Generative Adversarial Network) that recolors black and white images, using a few ImageNet images.

What process am I following?
I have taken a few dog images, which are stored in the ./ImageNet/dogs/ directory. Using Python code, I have set up the following steps:

  1. Resize the dog images to 244 x 244 resolution and save them in ./ImageNet/dogs_lowres/
  2. Convert the low-res dog images to grayscale and save them in ./ImageNet/dogs_bnw/
  3. Feed the low-res B&W images to the GAN model and generate colored images.

Where am I stuck?
I am stuck on understanding how the image dimensions / shapes are used. I get the following error:

ValueError: `logits` and `labels` must have the same shape, received ((32, 28, 28, 3) vs (32, 224, 224)).

Here's the code for Generator and Discriminator:

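import numpy as np
from tensorflow.keras.layers import Conv2DTranspose, Dense, Flatten, Reshape
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

# GAN model for recoloring black and white images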
generator = Sequential()
generator.add(Dense(7 * 7 * 128, input_dim=100))
generator.add(Reshape((7, 7, 128)))
generator.add(Conv2DTranspose(64, kernel_size=5, strides=2, padding='same'))
generator.add(Conv2DTranspose(32, kernel_size=5, strides=2, padding='same'))
generator.add(Conv2DTranspose(3, kernel_size=5, activation='sigmoid', padding='same'))

# Discriminator model
discriminator = Sequential()
discriminator.add(Flatten(input_shape=(224, 224, 3)))
discriminator.add(Dense(1, activation='sigmoid'))

# Compile the generator model
optimizer = Adam(learning_rate=0.0002, beta_1=0.5)
generator.compile(loss='binary_crossentropy', optimizer=optimizer)

# Train the GAN to recolor images
epochs = 10000
batch_size = 32

and the training loop is as follows:

for epoch in range(epochs):
    idx = np.random.randint(0, bw_images.shape[0], batch_size)
    real_images = bw_images[idx]

    noise = np.random.normal(0, 1, (batch_size, 100))
    generated_images = generator.predict(noise)

    # noise_rs = noise.reshape(-1, 1)
    g_loss = generator.train_on_batch(noise, real_images)

    if epoch % 100 == 0:
        print(f"Epoch: {epoch}, Generator Loss: {g_loss}")

Where is the error? I get the error on this line:
g_loss = generator.train_on_batch(noise, real_images)

When I check the shapes of the real_images and noise objects (in that order), this is what I get:

(32, 224, 224)
(32, 100)

Any help/suggestion is appreciated.


  • The generator outputs a tensor of shape (32, 28, 28, 3), but it is given a target of shape (32, 224, 224). The target differs in two ways: it is greyscale rather than colour, so it has no channel axis, and its spatial dimensions are larger (224 x 224 rather than 28 x 28).
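
    To see where 28 x 28 comes from, the generator's shapes can be traced by hand. The sketch below is plain Python, not Keras. It assumes the usual rule that a Conv2DTranspose layer with padding='same' multiplies height and width by its stride, and the helper name trace_generator is made up for illustration.

```python
def trace_generator(start_hw, layers):
    """Trace (height, width, channels) through Conv2DTranspose layers.

    Assumes padding='same', where the output size is input size * stride.
    `layers` is a list of (filters, stride) pairs.
    """
    height, width = start_hw
    shapes = []
    for filters, stride in layers:
        height, width = height * stride, width * stride
        shapes.append((height, width, filters))
    return shapes


# Layers from the question: Reshape to (7, 7, 128), then three Conv2DTranspose
# layers with 64, 32 and 3 filters; the last one uses the default stride of 1.
generator_layers = [(64, 2), (32, 2), (3, 1)]
shapes = trace_generator((7, 7), generator_layers)
for shape in shapes:
    print(shape)
```

    The last shape in the trace is (28, 28, 3), which with a batch size of 32 gives the (32, 28, 28, 3) reported in the error message.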

    I am assuming the target supplied to the generator should be colour rather than grayscale. You can load the colour images and resize them using:

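    import os

    import cv2
    import numpy as np

    def load_images_color(directory):
        images = []
        for filename in os.listdir(directory):
            img_path = os.path.join(directory, filename)
            img = cv2.imread(img_path)  # OpenCV loads images in BGR channel order
            if img is None:  # Skip files OpenCV cannot read
                continue
            img = cv2.resize(img, (224, 224))  # Resize images to 224x224
            img = img.astype('float32') / 255.0  # Normalize pixel values to [0, 1]
            images.append(img)  # Collect the preprocessed image
        return np.array(images)

    # Load colour images
    cl_images = load_images_color('./ImageNet/dogs')

    for epoch in range(epochs):
        # Sample a batch of colour images and a matching batch of noise vectors
        idx = np.random.randint(0, cl_images.shape[0], batch_size)
        cl_real = cl_images[idx]
        noise = np.random.normal(0, 1, (batch_size, 100))

        # Resize colour images to match the generator output shape
        cl_real_small = []
        for im in cl_real:
            cl_real_small.append(cv2.resize(im, (28, 28)))
        cl_real_small = np.array(cl_real_small)

        g_loss = generator.train_on_batch(noise, cl_real_small)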
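
    The resize above shrinks the targets to fit the generator. If 224 x 224 output is wanted instead (the discriminator in the question is built with input_shape=(224, 224, 3)), the other option would be to grow the generator. That is an assumption about the goal, not something the question states. A rough plain-Python sketch, using a hypothetical helper name, of how many Conv2DTranspose layers with strides=2 and padding='same' that would take, starting from the 7 x 7 Reshape:

```python
def stride2_layers_needed(start, target):
    """Count the stride-2 upsampling layers needed to grow `start` to `target`.

    Assumes each layer exactly doubles the size (Conv2DTranspose with
    strides=2 and padding='same'). Raises ValueError if `target` cannot be
    reached by repeated doubling.
    """
    size, count = start, 0
    while size < target:
        size *= 2
        count += 1
    if size != target:
        raise ValueError(f"{target} is not {start} times a power of two")
    return count


layers_for_28 = stride2_layers_needed(7, 28)    # what the question's generator does
layers_for_224 = stride2_layers_needed(7, 224)  # what full-size output would need
print(layers_for_28, layers_for_224)
```

    The question's generator has two stride-2 layers, so it stops at 28 x 28; reaching 224 x 224 from 7 x 7 would take five.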