I have a simple model in Keras 2.8.0 that takes a batch of images and outputs the pairwise squared distances between their embeddings:
import numpy as np
from tensorflow.keras import layers, models
from tensorflow.keras import backend as K

inp = layers.Input((28, 28, 1))
x = layers.Conv2D(64, (3, 3), padding='same')(inp)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, (3, 3), padding='same')(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)
# Pairwise squared Euclidean distances between all embeddings in the batch
x = layers.Lambda(lambda y: K.sum((y - y[:, None])**2, axis=-1))(x)
model = models.Model(inp, x)
It works well when the batch size is less than or equal to 32, for example:
X = np.random.rand(32, 28, 28, 1).astype(np.float32)
model.predict(X).shape
# (32, 32)
However, when the batch size is greater than 32, I get an error:
X = np.random.rand(40, 28, 28, 1).astype(np.float32)
model.predict(X).shape
# InvalidArgumentError: ConcatOp : Dimension 1 in both shapes must be equal: shape[0] = [32,32] vs. shape[1] = [8,8] [Op:ConcatV2] name: concat
As I understand it, this is because the default batch size of Keras layers is 32, but I haven't found how to change it or make it adapt to the actual batch size. How can I change the Keras default batch size when compiling a model?
Your code works: model(X) returns a (40, 40) output. The real issue is with model.predict(X): the ConcatOp error arises because .predict() uses a default batch_size of 32 (it has nothing to do with the Keras layers). predict() splits the 40 samples into a batch of 32 and a batch of 8, runs the model on each, and then tries to concatenate the per-batch outputs, which have shapes (32, 32) and (8, 8). Your pairwise-distance output depends on the batch size in both dimensions, so it cannot be computed batch by batch and stitched back together.
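To see why the concatenation fails, here is a minimal sketch in plain NumPy (purely illustrative, not Keras internals) of what .predict() effectively attempts with the two per-batch outputs:

import numpy as np

# What predict() computes internally for 40 samples with batch_size=32:
out_batch1 = np.zeros((32, 32))  # pairwise distances within the first 32 samples
out_batch2 = np.zeros((8, 8))    # pairwise distances within the remaining 8

# predict() then tries to stack the per-batch outputs along axis 0,
# which fails because the second dimension differs (32 vs 8):
np.concatenate([out_batch1, out_batch2], axis=0)
# ValueError: all the input array dimensions except for the
# concatenation axis must match exactly

Note that even if the shapes happened to match, batching would still be wrong here: distances between samples that land in different batches would never be computed at all.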
To solve the issue, either call the model directly with model(X), or pass an explicit batch size that covers all samples: model.predict(X, batch_size=len(X)).
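For example, both of the following produce the full (40, 40) distance matrix (a quick check, assuming the model defined above):

X = np.random.rand(40, 28, 28, 1).astype(np.float32)

# Option 1: call the model directly (no internal batching is performed)
model(X).shape                              # TensorShape([40, 40])

# Option 2: make predict() use a single batch covering all samples
model.predict(X, batch_size=len(X)).shape   # (40, 40)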