Python train convolutional neural network on csv numpy error input shape

I would like to train a convolutional autoencoder on data from a csv file. The csv file contains the pixel neighborhood positions of an original 1024x1024 image. When I try to train the model, I get the following error, which I have not managed to resolve: ValueError: Input 0 of layer max_pooling2d is incompatible with the layer: expected ndim=4, found ndim=5. Full shape received: (None, 1, 1024, 1024, 16). Any idea what I am doing wrong in my code?

Let me explain my code:

My csv file has this structure:

0 0 1.875223e+01
1 0 1.875223e+01
2 0 2.637685e+01
3 0 2.637685e+01
4 0 2.637685e+01

I managed to load my dataset and extract the x, y, and value columns as NumPy arrays:

x = data[0].values
y = data[1].values
values = data[2].values

Then I create an empty image with the correct dimensions and fill it with the pixel values:

image = np.empty((1024, 1024))


for i, (xi, yi, value) in enumerate(zip(x, y, values)):
    image[xi.astype(int), yi.astype(int)] = value

To use this array as input to my convolutional autoencoder, I reshaped it to a 4D array with dimensions (1, 1024, 1024, 1):

# Reshape the image array to a 4D tensor
image = image.reshape((1, image.shape[0], image.shape[1], 1))

Finally, I declare the convolutional autoencoder structure. At this stage I get the error 'incompatible with the layer: expected ndim=4, found ndim=5. Full shape received: (None, 1, 1024, 1024, 16)':

import keras
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

# Define the input layer
input_layer = Input(shape=(1,image.shape[1], image.shape[2], 1))

# Define the encoder layers
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_layer)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

# Define the decoder layers
x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

# Define the autoencoder model
autoencoder = Model(input_layer, decoded)

# Compile the model
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

# Reshape the image array to a 4D tensor
image = image.reshape((1, image.shape[0], image.shape[1], 1))

# Train the model
autoencoder.fit(image, image, epochs=50, batch_size=1, shuffle=True)

Solution

You should drop the first dimension in the input layer:

input_layer = Input(shape=(image.shape[1], image.shape[2], 1))

The shape of the training array and the shape passed to the input layer should differ. The input layer shape describes a single data point, while the training array describes the whole dataset, so it has one extra leading dimension whose size is the number of data points.
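To make the distinction concrete, here is a minimal NumPy-only sketch. It assumes a zero-filled array as a stand-in for the image built from the csv; the names `batch` and `sample_shape` are only illustrative and do not appear in the question.

```python
import numpy as np

# Stand-in for the image built from the csv (zeros instead of real pixel values).
image = np.zeros((1024, 1024))

# Training array: (samples, height, width, channels).
batch = image.reshape((1, 1024, 1024, 1))

# Per-sample shape: everything except the leading sample axis.
# This is what Input(shape=...) expects.
sample_shape = batch.shape[1:]

print(batch.shape)   # (1, 1024, 1024, 1)
print(sample_shape)  # (1024, 1024, 1)
```

Keras adds the batch axis itself (shown as None in the error message), which is why passing the 4-element per-array shape to Input produced a 5D tensor.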

You should also drop the second image.reshape, since at that point image.shape == (1, 1024, 1024, 1), and image = image.reshape((1, image.shape[0], image.shape[1], 1)) tries to reshape it into (1, 1, 1024, 1), which is impossible because the number of elements differs.
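The failing second reshape can be reproduced in isolation. This is a small sketch that again uses a zero-filled stand-in array; `target` and `reshape_failed` are illustrative names, not part of the original code.

```python
import numpy as np

# Array as it looks after the first reshape in the question.
image = np.zeros((1024, 1024)).reshape((1, 1024, 1024, 1))

# Shape the second reshape asks for, computed from the already-4D array.
target = (1, image.shape[0], image.shape[1], 1)
print(target)  # (1, 1, 1024, 1)

try:
    image.reshape(target)
    reshape_failed = False
except ValueError as err:
    # 1024 * 1024 elements cannot fit into a shape holding only 1024.
    reshape_failed = True
    print(err)
```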

Lastly, you forgot to add padding='same' to one of the Conv2D layers in the decoder (the one with 16 filters). Add it so that the output shape of the model matches the shape of your training target.
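The effect of the missing padding can be checked with plain shape arithmetic, without building the model. The sketch below assumes stride-1 3x3 convolutions, 2x2 pooling with 'same' padding, and 2x upsampling, as in the question; the helper names (`conv`, `pool`, `upsample`, `autoencoder_output_size`) are hypothetical.

```python
import math

def conv(n, kernel=3, padding="same"):
    # Stride-1 convolution: 'same' keeps the size, 'valid' trims kernel - 1.
    return n if padding == "same" else n - kernel + 1

def pool(n, size=2):
    # Max pooling with padding='same' rounds up.
    return math.ceil(n / size)

def upsample(n, size=2):
    return n * size

def autoencoder_output_size(n, third_conv_padding):
    # Encoder: three Conv2D + MaxPooling2D pairs.
    for _ in range(3):
        n = pool(conv(n))
    # Decoder: three Conv2D + UpSampling2D pairs, then the final Conv2D.
    n = upsample(conv(n))
    n = upsample(conv(n))
    n = upsample(conv(n, padding=third_conv_padding))
    return conv(n)

print(autoencoder_output_size(1024, "valid"))  # 1020
print(autoencoder_output_size(1024, "same"))   # 1024
```

With the default 'valid' padding the model outputs 1020x1020 images, which cannot be compared against the 1024x1024 target in fit.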