Error when checking target: expected conv2d_29 to have 4 dimensions, but got array with shape (1255, 12)

I would like to train a deep learning model, where input image shape is (224,224,3) . And I would like to feed them into a u-net model.

After training I get the error : Error when checking target: expected conv2d_29 to have 4 dimensions, but got array with shape (1255, 12)

I'm confused since I'm sure the image array and label has no issue. Is the issue within the model? How should I resolve this?

The model is as below:

#def unet(pretrained_weights = None, input_size = (224,224,3)):
concat_axis = 3
input_size= Input((224,224,3))
conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(input_size)
conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
#flat1 = Flatten()(pool1)
conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool1)
conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv2)
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool2)
conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)
pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool3)
conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv4)
drop4 = Dropout(0.5)(conv4)
pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)

conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool4)
conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
drop5 = Dropout(0.5)(conv5)

up_conv5 = UpSampling2D(size=(2, 2),  data_format="channels_last")(conv5)
ch, cw = get_crop_shape(conv4, up_conv5)
crop_conv4 = Cropping2D(cropping=(ch,cw),  data_format="channels_last")(conv4)
up6   = concatenate([up_conv5, crop_conv4], axis=concat_axis)
conv6 = Conv2D(256, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up6)
conv6 = Conv2D(256, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv6)

up_conv6 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv6)
ch, cw = get_crop_shape(conv3, up_conv6)
crop_conv3 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv3)
up7   = concatenate([up_conv6, crop_conv3], axis=concat_axis)
conv7 = Conv2D(128, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up7)
conv7 = Conv2D(128, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv7)

up_conv7 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv7)
ch, cw = get_crop_shape(conv2, up_conv7)
crop_conv2 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv2)
up8   = concatenate([up_conv7, crop_conv2], axis=concat_axis)
conv8 = Conv2D(64, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up8)
conv8 = Conv2D(64, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv8)

up_conv8 = UpSampling2D(size=(2, 2), data_format="channels_last")(conv8)
ch, cw = get_crop_shape(conv1, up_conv8)
crop_conv1 = Cropping2D(cropping=(ch,cw), data_format="channels_last")(conv1)
up9   = concatenate([up_conv8, crop_conv1], axis=concat_axis)
conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(up9)
conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv9)

model = Model(inputs = input_size, outputs = conv9)

Solution

Since the model output's layer is conv layer, the output shape has 4 dimensions(Batch_size, width, height, channels). But you are feeding an array of shape (1255, 12). If the target label has a shape of (Batch_size, num_features) then the last layer's output should have a shape of (None, 12) or (Batch_size, 12). You have two options to deal with this situation.

Using dense layer after flattening the output of conv layer
Reshaping the output of conv layer to have the desired shape.

The choice depends on the problem you're dealing with. If the problem is classification, option one could be used to add softmax activation. With option 1 the modification to the code would be,

conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv9)
flatten1 = Flatten()(conv9)
dense1 = Dense(12, activation="softmax")(flatten1) # The choice  of the activation depends on the problem you are dealing with.
model = Model(inputs = input_size, outputs = dense1)

With option 2, the modification would be

conv9 = Conv2D(32, (3, 3), padding="same", activation="relu", kernel_initializer = 'he_normal')(conv9)
reshape1 = Reshape((12,)(conv9) # The choice  of the activation depends on the problem you are dealing with.
model = Model(inputs = input_size, outputs = reshape1)

N.B: When the Reshape layer is used to reshape tensor to (None, 12) shape be sure that the product of the output shape of the previous layer should be divisible by 12.