Tags: python, tensorflow, machine-learning, keras, autoencoder

Disappearing Dimensions in Multi-Output Keras Model


When I try to train the autoencoder described below, I receive the error: 'A target array with shape (256, 28, 28, 1) was passed for an output of shape (None, 0, 28, 1) while using as loss `binary_crossentropy`. This loss expects targets to have the same shape as the output.' The input and output dimensions should both be (28, 28, 1), with 256 being the batch size. Running .summary() confirms that the output of the decoder model is the correct (28, 28, 1), but this seemingly changes when the encoder and decoder are compiled together. Any idea what's happening here? The three functions are called in succession when the network is generated.

from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D,
                                     UpSampling2D, Cropping2D, concatenate)
from tensorflow.keras.models import Model

def buildEncoder():
    input1 = Input(shape=(28, 28, 1))
    input2 = Input(shape=(28, 28, 1))
    merge = concatenate([input1, input2])
    convEncode1 = Conv2D(16, (3, 3), activation='relu', padding='same')(merge)
    maxPoolEncode1 = MaxPooling2D(pool_size=(2, 1))(convEncode1)
    convEncode2 = Conv2D(16, (3, 3), activation='sigmoid', padding='same')(maxPoolEncode1)
    convEncode3 = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(convEncode2)
    model = Model(inputs=[input1, input2], outputs=convEncode3)
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model

def buildDecoder():
    input1 = Input(shape=(28, 28, 1))
    upsample1 = UpSampling2D((2, 1))(input1)
    convDecode1 = Conv2D(16, (3, 3), activation='relu', padding='same')(upsample1)
    crop1 = Cropping2D(cropping=((0, 28), (0, 0)))(convDecode1)
    crop2 = Cropping2D(cropping=((28, 0), (0, 0)))(convDecode1)
    convDecode2_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(crop1)
    convDecode3_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(crop2)
    convDecode2_2 = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(convDecode2_1)
    convDecode3_2 = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(convDecode3_1)
    model = Model(inputs=input1, outputs=[convDecode2_2, convDecode3_2])
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model

def buildAutoencoder():
    # encoder and decoder are the models returned by the two functions above
    autoInput1 = Input(shape=(28, 28, 1))
    autoInput2 = Input(shape=(28, 28, 1))
    encode = encoder([autoInput1, autoInput2])
    decode = decoder(encode)
    model = Model(inputs=[autoInput1, autoInput2], outputs=[decode[0], decode[1]])
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model



Solution

  • It looks like you have a shape mis-calculation in your encoder. You assume the decoder will receive (None, 28, 28, 1), but your encoder actually outputs (None, 14, 28, 1): the MaxPooling2D(pool_size=(2, 1)) layer halves the height from 28 to 14.

    print(encoder.output) # Tensor("model_1/conv2d_3/Sigmoid:0", shape=(?, 14, 28, 1), dtype=float32)
    

    Now in your decoder you crop assuming the input has height 28: UpSampling2D((2, 1)) is expected to produce height 56, from which each Cropping2D branch removes 28 rows. With the encoder's actual height of 14, upsampling only gives 28, and cropping 28 rows chops the output down to height 0. The models work on their own; the mismatch happens when you connect them.
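    You can verify the arithmetic without building the models at all. A minimal sketch in plain Python (no Keras needed; the helper name trace_shapes is made up for illustration):

    ```python
    def trace_shapes(h, w):
        """Trace the tensor height/width through the encoder and decoder as written."""
        # Encoder: Conv2D with padding='same' preserves H and W;
        # MaxPooling2D(pool_size=(2, 1)) floor-divides the height by 2.
        enc_h, enc_w = h // 2, w
        # Decoder: UpSampling2D((2, 1)) doubles the height, then
        # Cropping2D(cropping=((0, 28), (0, 0))) removes 28 rows from it.
        dec_h, dec_w = enc_h * 2 - 28, enc_w
        return (enc_h, enc_w), (dec_h, dec_w)

    enc, dec = trace_shapes(28, 28)
    print(enc)  # (14, 28): the encoder's real output height/width, not (28, 28)
    print(dec)  # (0, 28): every row is cropped away, matching the (None, 0, 28, 1) error
    ```

    One possible fix, assuming you want to keep the (14, 28, 1) code shape: declare the decoder's Input as (14, 28, 1) and use UpSampling2D((4, 1)), so the height reaches 56 before the two 28-row crops split it into the two (28, 28, 1) outputs.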