tensorflow machine-learning keras image-classification batch-normalization

model.predict() in TensorFlow Keras gives the same output for all images when the dataset size increases?

I have been trying to use a pre-trained model (XceptionNet) to get a feature vector for each input image for a classification task. However, I am stuck because model.predict() returns a different, unreliable output vector for the same image when the dataset size changes.

In the following code, batch holds the images. For each image I want a feature vector, which I obtain from the pre-trained model.

batch.shape
TensorShape([803, 800, 600, 3])

To make it clear that the input images are all different, here are a few of them displayed.

plt.imshow(batch[-23])
plt.figure()
plt.imshow(batch[-15])

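My model is the following:

import tensorflow as tf
from tensorflow.keras.applications import Xception
from tensorflow.keras.layers import GlobalAvgPool2D, Input

INPUT_SHAPE = (800, 600)

# Frozen ImageNet backbone without the classification head
model_xception = Xception(weights="imagenet", input_shape=(*INPUT_SHAPE, 3), include_top=False)
model_xception.trainable = False
inp = Input(shape=(*INPUT_SHAPE, 3))
out = model_xception(inp, training=False)  # keep BatchNormalization in inference mode
output = GlobalAvgPool2D()(out)  # one feature vector per image
model = tf.keras.Model(inp, output, name='Xception-kPiece')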

The issue shows up in the following outputs:

model.predict(batch[-25:]) # prediction on the last 25 images

1/1 [==============================] - 1s 868ms/step

array([[4.99584060e-03, 4.25433293e-02, 9.93836671e-02, ...,
        3.21301445e-03, 2.59823762e-02, 9.08260979e-03],
       [2.50613055e-04, 1.18759666e-02, 0.00000000e+00, ...,
        1.77203789e-02, 7.71604702e-02, 1.28602296e-01],
       [3.41954082e-02, 1.82092339e-02, 5.07147610e-03, ...,
        7.09404126e-02, 9.45318267e-02, 2.69510925e-01],
       ...,
       [0.00000000e+00, 5.16504236e-03, 4.90547449e-04, ...,
        4.62833559e-04, 9.43152513e-03, 1.17826145e-02],
       [0.00000000e+00, 4.64747474e-03, 0.00000000e+00, ...,
        1.21422185e-04, 4.47714329e-03, 1.92385539e-02],
       [0.00000000e+00, 1.29655155e-03, 4.02751788e-02, ...,
        0.00000000e+00, 0.00000000e+00, 3.20959717e-01]], dtype=float32)

model.predict(batch)[-25:] # prediction on entire dataset of 803 images and then extracting the vectors corresponding to the last 25 images

26/26 [==============================] - 34s 1s/step

array([[1.7320104e-05, 3.6561250e-04, 0.0000000e+00, ..., 0.0000000e+00,
        3.5924271e-02, 0.0000000e+00],
       [1.7320104e-05, 3.6561250e-04, 0.0000000e+00, ..., 0.0000000e+00,
        3.5924271e-02, 0.0000000e+00],
       [1.7320104e-05, 3.6561250e-04, 0.0000000e+00, ..., 0.0000000e+00,
        3.5924271e-02, 0.0000000e+00],
       ...,
       [1.7318112e-05, 3.6561041e-04, 0.0000000e+00, ..., 0.0000000e+00,
        3.5924841e-02, 0.0000000e+00],
       [1.7318112e-05, 3.6561041e-04, 0.0000000e+00, ..., 0.0000000e+00,
        3.5924841e-02, 0.0000000e+00],
       [1.7318112e-05, 3.6561041e-04, 0.0000000e+00, ..., 0.0000000e+00,
        3.5924841e-02, 0.0000000e+00]], dtype=float32)

This behavior has two problems:

The two outputs differ, even though the last 25 input images are the same in both calls.
Every input image in the larger batch gets the same output.
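
To make the two problems above checkable rather than eyeballed, here is a minimal sketch using NumPy only. The helper name compare_predictions, the tolerance, and the rounding precision are my own choices for illustration, not part of Keras, and the toy arrays merely stand in for the real model.predict() results.

```python
import numpy as np

def compare_predictions(small, large_tail, atol=1e-5, decimals=4):
    """Summarize how two sets of feature vectors for the same images differ.

    Hypothetical helper: `small` and `large_tail` are assumed to have the
    same shape (n_images, n_features).
    """
    small = np.asarray(small)
    large_tail = np.asarray(large_tail)
    return {
        # True when both calls produced numerically the same vectors.
        "outputs_match": bool(np.allclose(small, large_tail, atol=atol)),
        # Distinct rows after rounding; far fewer rows than images
        # means the outputs have collapsed to (nearly) one vector.
        "distinct_rows_small": len(np.unique(small.round(decimals), axis=0)),
        "distinct_rows_large": len(np.unique(large_tail.round(decimals), axis=0)),
    }

# Toy stand-ins for model.predict(batch[-3:]) and model.predict(batch)[-3:]
good = np.array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])
collapsed = np.tile([[0.0, 0.0359]], (3, 1))
report = compare_predictions(good, collapsed)
print(report)
```

With the real model, the call would be compare_predictions(model.predict(batch[-25:]), model.predict(batch)[-25:]).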

My take on the problem:

I suspect the BatchNormalization layers are causing the issue. But what is the fix? I am calling model_xception with training=False and have also set model_xception.trainable = False, yet the output is still the same for all inputs.
The problem appears when the number of images in the batch increases.
The issue is not limited to XceptionNet; it shows up with other models as well. I have also experimented with EfficientNetV2 models.

Can anyone help fix the bug?

Solution

The issue seems to appear because I am using tensorflow-macos, which has a major bug: predictions are wrong once the number of input images exceeds a certain threshold.

See the issue in action below:

With 57 input images, each image gets its own distinct prediction, and the predictions match those obtained with 56, ..., 1 input images (which is the consistent, expected behavior).

model.predict(batch[-57:])

1/1 [==============================] - 2s 2s/step

array([[0.00000000e+00, 2.56574154e-02, 1.79693177e-01, ...,
        2.85670068e-03, 1.08444700e-02, 2.34257965e-03],
       [0.00000000e+00, 1.28444552e-03, 0.00000000e+00, ...,
        4.11680201e-03, 4.49061068e-03, 1.83695972e-01],
       [0.00000000e+00, 2.29660165e-03, 7.84890354e-03, ...,
        1.86224483e-04, 1.81426702e-03, 1.54079705e-01],
       ...,
       [0.00000000e+00, 5.16504236e-03, 4.90547449e-04, ...,
        4.62833559e-04, 9.43152513e-03, 1.17826145e-02],
       [0.00000000e+00, 4.64747474e-03, 0.00000000e+00, ...,
        1.21422185e-04, 4.47714329e-03, 1.92385539e-02],
       [0.00000000e+00, 1.29655155e-03, 4.02751788e-02, ...,
        0.00000000e+00, 0.00000000e+00, 3.20959717e-01]], dtype=float32)

model.predict(batch[-55:])

2/2 [==============================] - 2s 1s/step

array([[0.00000000e+00, 2.29660165e-03, 7.84890354e-03, ...,
        1.86224483e-04, 1.81426702e-03, 1.54079705e-01],
       [4.94572960e-05, 8.04292504e-04, 5.08825444e-02, ...,
        4.58029518e-03, 2.09121332e-02, 5.57549708e-02],
       [0.00000000e+00, 1.62312540e-03, 0.00000000e+00, ...,
        4.35817856e-05, 2.16606092e-02, 1.30677417e-01],
       ...,
       [0.00000000e+00, 5.16504236e-03, 4.90547449e-04, ...,
        4.62833559e-04, 9.43152513e-03, 1.17826145e-02],
       [0.00000000e+00, 4.64747474e-03, 0.00000000e+00, ...,
        1.21422185e-04, 4.47714329e-03, 1.92385539e-02],
       [0.00000000e+00, 1.29655155e-03, 4.02751788e-02, ...,
        0.00000000e+00, 0.00000000e+00, 3.20959717e-01]], dtype=float32)

But when the number of input images is increased to 58 or more, the issue described above appears.

model.predict(batch[-58:])

1/1 [==============================] - 2s 2s/step

array([[5.3905282e-04, 2.8516021e-02, 1.2775734e-03, ..., 5.4674568e-03,
        1.7451918e-02, 9.4717339e-02],
       [0.0000000e+00, 2.8345605e-02, 1.2786543e-03, ..., 0.0000000e+00,
        2.4870334e-03, 1.2716405e-01],
       [4.3588653e-03, 8.2868971e-02, 1.8764129e-02, ..., 2.5320805e-03,
        5.9973758e-02, 6.9927111e-02],
       ...,
       [1.7320104e-05, 3.6561250e-04, 0.0000000e+00, ..., 0.0000000e+00,
        3.5924271e-02, 0.0000000e+00],
       [1.7320104e-05, 3.6561250e-04, 0.0000000e+00, ..., 0.0000000e+00,
        3.5924271e-02, 0.0000000e+00],
       [1.7320104e-05, 3.6561250e-04, 0.0000000e+00, ..., 0.0000000e+00,
        3.5924271e-02, 0.0000000e+00]], dtype=float32)
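
The threshold above was found by hand. A rough sketch of how the search could be automated follows. The function first_inconsistent_size and the stand-in flaky_predict are hypothetical names of mine; flaky_predict only simulates the collapse above 57 images so the example runs without TensorFlow. The reference is assumed to be a set of trusted per-image features, for example ones computed one image at a time.

```python
import numpy as np

def first_inconsistent_size(predict_fn, images, reference, sizes, atol=1e-5):
    """Return the smallest n in `sizes` for which predict_fn(images[-n:])
    deviates from reference[-n:], or None if every size is consistent."""
    for n in sorted(sizes):
        got = np.asarray(predict_fn(images[-n:]))
        if not np.allclose(got, reference[-n:], atol=atol):
            return n
    return None

def flaky_predict(x):
    """Simulated model: correct features up to 57 images, collapsed beyond."""
    x = np.asarray(x)
    feats = x.mean(axis=(1, 2))  # average over height and width
    if len(x) > 57:
        feats[:] = feats[0]      # every row becomes the same vector
    return feats

rng = np.random.default_rng(0)
images = rng.random((80, 2, 2, 3))
reference = images.mean(axis=(1, 2))  # trusted per-image features
threshold = first_inconsistent_size(flaky_predict, images, reference, range(1, 81))
print(threshold)
```

With the real model, predict_fn would be model.predict and images would be batch; whether the threshold is the same on other machines is not something I have checked.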

If anyone could suggest a fix or workaround that still uses TensorFlow on a Mac, it would be really helpful.

There is also a GitHub issue about this, which is still not fixed, here.
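
One possible workaround, going only by the observation that calls with 57 or fewer images gave consistent results, is to never hand the model more than a small slice at a time and stack the results. This is a sketch under that assumption, not a confirmed fix for the tensorflow-macos bug. predict_in_chunks and fake_predict are hypothetical names, and fake_predict stands in for model.predict so the example runs without TensorFlow.

```python
import numpy as np

def predict_in_chunks(predict_fn, images, chunk_size=32):
    """Call predict_fn on consecutive slices of `images` and stack the results."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    outputs = []
    for start in range(0, len(images), chunk_size):
        chunk = images[start:start + chunk_size]
        outputs.append(np.asarray(predict_fn(chunk)))
    if not outputs:
        raise ValueError("images is empty")
    return np.concatenate(outputs, axis=0)

def fake_predict(x):
    """Stand-in for model.predict: global average over height and width."""
    return np.asarray(x).mean(axis=(1, 2))

images = np.random.default_rng(0).random((70, 4, 4, 3))
features = predict_in_chunks(fake_predict, images, chunk_size=32)
print(features.shape)
```

With the real model this would be called as predict_in_chunks(lambda x: model.predict(x, verbose=0), batch, chunk_size=32). Another option that may be worth trying is running the prediction inside a with tf.device("/CPU:0"): block, although I have not verified that either approach avoids the bug.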