I am using the Keras pre-trained model ResNet50 to train on my own dataset, which contains only one image for testing purposes. First, I evaluate the model on my image and get a loss of 0.5 and an accuracy of 1. Then, I fit the model and get a loss of 6 and an accuracy of 0. I don't understand why the losses from evaluation and from the training-time forward pass don't match — it seems the behaviors of inference and training in Keras are different. I have attached my code snippet and a screenshot of its output.
from keras.applications.resnet50 import ResNet50, preprocess_input
from keras.preprocessing import image
import numpy as np

model = ResNet50(weights='imagenet')
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
y = np.zeros((1, 1000))
y[0, 386] = 1
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['categorical_accuracy'])
model.evaluate(x, y)
1/1 [==============================] - 1s 547ms/step [0.5232877135276794, 1.0]
model.fit(x, y, validation_data=(x, y))
Train on 1 samples, validate on 1 samples
Epoch 1/1
1/1 [==============================] - 3s 3s/step - loss: 6.1883 - categorical_accuracy: 0.0000e+00 - val_loss: 9.8371e-04 - val_categorical_accuracy: 1.0000
model.evaluate(x, y)
1/1 [==============================] - 0s 74ms/step [0.0009837078396230936, 1.0]
Sorry for misunderstanding the question at first. The problem is quite tricky, and it is likely caused by the BatchNorm layers, as @Natthaphon mentioned in the comments: when I tried the same experiment on VGG16 (which has no BatchNorm layers), the two losses matched.
Then I tested ResNet50 again, and the evaluate loss and fit loss still don't match even though I "freeze" all layers. I also checked the BN weights manually, and they are indeed not changed by fit.
from keras.applications import ResNet50, VGG16
from keras.applications.resnet50 import preprocess_input
from keras.preprocessing import image
import keras
from keras import backend as K
import numpy as np
img_path = '/home/zhihao/Downloads/elephant.jpeg'
img = image.load_img(img_path, target_size=(224, 224))
model = ResNet50(weights='imagenet')
for layer in model.layers:
    layer.trainable = False
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
y = np.zeros((1, 1000))
y[0, 386] = 1
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['categorical_accuracy'])
model.evaluate(x, y)
# 1/1 [==============================] - 2s 2s/step
# [0.2981376349925995, 1.0]
model.fit(x, y, validation_data=(x, y))
# Train on 1 samples, validate on 1 samples
# Epoch 1/1
# 1/1 [==============================] - 1s 549ms/step - loss: 5.3056 - categorical_accuracy: 0.0000e+00 - val_loss: 0.2981 - val_categorical_accuracy: 1.0000
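The BN-weight check mentioned above can be sketched as follows. This is a standalone illustration that uses a tiny random-weight model in place of ResNet50 (to avoid downloading the ImageNet weights); the layer sizes and data here are made up for the example, but the same snapshot loop applies to any model:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Input, Dense, BatchNormalization

# Tiny stand-in model (hypothetical sizes); ResNet50 works the same way.
model = Sequential([Input(shape=(4,)),
                    Dense(4),
                    BatchNormalization(),
                    Dense(2, activation='softmax')])
for layer in model.layers:
    layer.trainable = False
model.compile(loss='categorical_crossentropy', optimizer='sgd')

def bn_snapshot(m):
    # Copy every weight (gamma, beta, moving_mean, moving_variance)
    # of every BatchNormalization layer.
    return [np.copy(w) for layer in m.layers
            if isinstance(layer, BatchNormalization)
            for w in layer.get_weights()]

rng = np.random.RandomState(0)
x = rng.rand(8, 4).astype('float32')
y = np.eye(2)[rng.randint(0, 2, 8)].astype('float32')

before = bn_snapshot(model)
model.fit(x, y, epochs=1, verbose=0)
after = bn_snapshot(model)
print(all(np.array_equal(b, a) for b, a in zip(before, after)))
```

With all layers frozen before compiling, the snapshot comparison confirms that fit leaves the BN weights untouched, so the loss gap cannot come from weight updates.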
Notice that the evaluate loss is 0.2981 while the fit loss is 5.3056, even with every layer frozen. My guess is that BatchNorm layers behave differently in train mode and test mode: in train mode they normalize with the statistics of the current batch, while in test mode they use the stored moving averages. Correct me if I'm wrong.
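To illustrate the guess, here is a minimal NumPy sketch of the BatchNorm math (my own simplified re-implementation with gamma=1 and beta=0, not Keras code). With a single-image batch, as in the question, train mode normalizes the sample against itself while test mode uses the moving averages, so the activations — and hence the loss — can differ wildly even with identical weights:

```python
import numpy as np

def batchnorm(x, moving_mean, moving_var, training, eps=1e-3):
    """Simplified BatchNorm (gamma=1, beta=0) for illustration only."""
    if training:
        # Train mode: normalize with the current batch's own statistics.
        mean, var = x.mean(axis=0), x.var(axis=0)
    else:
        # Test mode: normalize with the stored moving averages.
        mean, var = moving_mean, moving_var
    return (x - mean) / np.sqrt(var + eps)

# Moving averages accumulated during pre-training (made-up values).
moving_mean = np.array([0.0, 0.0])
moving_var = np.array([1.0, 1.0])

# A batch of one sample whose statistics differ from the moving averages.
batch = np.array([[5.0, -3.0]])

train_out = batchnorm(batch, moving_mean, moving_var, training=True)
test_out = batchnorm(batch, moving_mean, moving_var, training=False)
print(train_out)  # a 1-sample batch normalizes to all zeros in train mode
print(test_out)   # test mode keeps values near [5, -3]: very different activations
```

Same input, same weights, two very different outputs — which is exactly the evaluate/fit loss mismatch observed above.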
The one way I found to really freeze the model is to call K.set_learning_phase(0) before building it, as follows:
K.set_learning_phase(0)  # all new operations will be in test mode from now on
model = ResNet50(weights='imagenet')
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['categorical_accuracy'])
model.fit(x, y, validation_data=(x, y))
# Train on 1 samples, validate on 1 samples
# Epoch 1/1
# 1/1 [==============================] - 4s 4s/step - loss: 0.2981 - categorical_accuracy: 1.0000 - val_loss: 16.1181 - val_categorical_accuracy: 0.0000e+00
Now the fit loss (0.2981) matches the evaluate loss.
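As a sanity check, evaluate() and predict() always run in test mode, so the evaluate loss can be reproduced by hand from the predicted probabilities. A standalone sketch with a tiny made-up model in place of ResNet50 (layer sizes and data are arbitrary):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Input, Dense, BatchNormalization

# Small stand-in model (hypothetical sizes), untrained random weights.
model = Sequential([Input(shape=(4,)),
                    Dense(8, activation='relu'),
                    BatchNormalization(),
                    Dense(3, activation='softmax')])
model.compile(loss='categorical_crossentropy', optimizer='sgd')

rng = np.random.RandomState(0)
x = rng.rand(5, 4).astype('float32')
y = np.eye(3)[rng.randint(0, 3, 5)].astype('float32')

# Both calls run in test mode, so the hand-computed cross-entropy
# over predict() outputs matches evaluate().
eval_loss = model.evaluate(x, y, verbose=0)
probs = model.predict(x, verbose=0)
manual_loss = -np.mean(np.sum(y * np.log(probs), axis=1))
print(np.isclose(eval_loss, manual_loss, atol=1e-4))
```

If the model were secretly using train-mode BatchNorm somewhere, these two numbers would diverge, which is a quick way to detect the mismatch described in this answer.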