Tags: python, autoencoder

ValueError: Input 0 of layer sequential is incompatible with the layer, for a 3D autoencoder


I have a 3D image of shape (32, 32, 32) in grayscale (it is taken from a magnetic resonance scan) and I'm trying to build a simple autoencoder with it. The problem comes when I try to fit the model to the image (model.fit()), because I get this error:

ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=5, found ndim=4. Full shape received: (32, 32, 32, 1)

The image is a .nii file. Starting from other posts that ask the same question for Conv2D, I tried to adapt some answers and reshaped the data, but I don't know why it's still expecting ndim=5. Shouldn't the fifth dimension be the batch dimension that Keras adds internally?

This is what I did:

cube = np.array(cube.get_fdata())
cube = cube.reshape(32, 32, 32, 1) 

This is the Autoencoder I built (It's my first time building it and for 3D images, so if there is something wrong with it please let me know):

sample_shape = (32, 32, 32, 1)
model = Sequential()
model.add(Conv3D(64, kernel_size=(3, 3, 3), activation='relu', padding='same', kernel_initializer='he_uniform', input_shape=sample_shape))
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Conv3D(32, kernel_size=(3, 3, 3), activation='relu', padding='same', kernel_initializer='he_uniform'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Conv3D(16, kernel_size=(3, 3, 3), activation='relu', padding='same', kernel_initializer='he_uniform'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
model.add(Conv3D(8, kernel_size=(3, 3, 3), activation='relu', padding='same', kernel_initializer='he_uniform'))
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding='same'))
     
model.add(Conv3D(8, kernel_size=(3, 3, 3), activation='relu', padding='same', kernel_initializer='he_uniform'))
model.add(UpSampling3D(size=(2, 2, 2)))
model.add(Conv3D(16, kernel_size=(3, 3, 3), activation='relu', padding='same', kernel_initializer='he_uniform'))
model.add(UpSampling3D(size=(2, 2, 2)))
model.add(Conv3D(32, kernel_size=(3, 3, 3), activation='relu', padding='same', kernel_initializer='he_uniform'))
model.add(UpSampling3D(size=(2, 2, 2)))
model.add(Conv3D(64, kernel_size=(3, 3, 3), activation='relu', padding='same', kernel_initializer='he_uniform'))
model.add(UpSampling3D(size=(2, 2, 2)))
model.add(Conv3D(3, kernel_size=(3, 3, 3), activation='relu', padding='same', kernel_initializer='he_uniform'))

model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])
model.summary()

Thanks!


Solution

  • Let's unpack each of the dimensions of your data.

    cube = np.array(cube.get_fdata())
    cube = cube.reshape(32, 32, 32, 1)
    # should be the line below. -1 means infer that number from the data
    # cube = cube.reshape(-1, 32, 32, 32, 1)
    

    The three 32s are your 3D volume, and the trailing 1 is the channel dimension, since the input has a single grayscale channel. You need to add another number before the first 32 to indicate the batch size: model.fit expects an array of samples, not a single sample, so the input must always carry a batch dimension. Keras, PyTorch, and other ML libraries do not add this dimension for you.

    For example, consider the MNIST dataset. Each digit has shape (28, 28, 1), but when we run the data through a network, we have to reshape it to (batch_size, 28, 28, 1), e.g. (1, 28, 28, 1) for a single image.
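
As a quick sanity check, here is a minimal sketch (using random data in place of the .nii volume) showing why the original reshape triggers the error and how the leading -1 fixes it:

```python
import numpy as np

# Stand-in for cube.get_fdata(); a real (32, 32, 32) MRI volume would go here.
cube = np.random.rand(32, 32, 32)

# Wrong: Keras reads the first axis of (32, 32, 32, 1) as the batch
# dimension, so each "sample" is only (32, 32, 1) -- ndim=4, not min_ndim=5.
wrong = cube.reshape(32, 32, 32, 1)
print(wrong.shape)   # (32, 32, 32, 1)

# Right: -1 infers the batch size (1 here) from the data, giving the
# (batch, depth, height, width, channels) shape the Conv3D layers expect.
right = cube.reshape(-1, 32, 32, 32, 1)
print(right.shape)   # (1, 32, 32, 32, 1)

# model.fit(right, right, epochs=...) now receives the expected 5D input.
```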