python machine-learning keras loss cross-entropy

Keras loss consistently low but accuracy starts high then drops


First off, my assumptions might be wrong:

  1. Loss is how far each training example is from the correct answer, averaged over the number of examples (a kind of mean loss).
  2. Accuracy is the fraction of training examples that are classified correctly, given as a percentage. If the highest output is taken as the predicted class, then it doesn't matter that it's only 0.7 (which would give a loss of 0.3); the example still counts as correct.
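
As a rough check of these two definitions, here is a minimal sketch in plain Python. The helper names and the example probabilities are made up for illustration and are not taken from the model in this question. One detail it makes visible: categorical cross-entropy is the negative log of the probability assigned to the true class, so a 0.7 prediction costs about 0.357 rather than 0.3.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Mean of -log(probability assigned to the true class)."""
    losses = []
    for target, probs in zip(y_true, y_pred):
        true_index = target.index(1)  # position of the 1 in the one-hot label
        losses.append(-math.log(probs[true_index]))
    return sum(losses) / len(losses)


def accuracy(y_true, y_pred):
    """Fraction of examples whose highest output matches the true class."""
    correct = sum(
        1
        for target, probs in zip(y_true, y_pred)
        if probs.index(max(probs)) == target.index(1)
    )
    return correct / len(y_true)


# Hypothetical one-hot labels and predicted probabilities for 4 examples.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
y_pred = [
    [0.7, 0.2, 0.1],  # correct, moderately confident
    [0.1, 0.8, 0.1],  # correct
    [0.2, 0.2, 0.6],  # correct
    [0.4, 0.5, 0.1],  # wrong: class 1 scores highest, true class is 0
]

single_loss = -math.log(0.7)
mean_loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)

print("loss for a 0.7 prediction:", round(single_loss, 3))
print("mean loss:", round(mean_loss, 3))
print("accuracy:", acc)
```

In this toy data the accuracy is 75% while the mean loss is around 0.5, so a healthy run normally shows the two metrics moving together: a loss near zero alongside an accuracy near chance is a sign that something is off.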

To my eye that means that accuracy will typically be closer to 100% than loss will be to 0. This is not what I'm seeing:

10000/10000 [==============================] - 1067s - loss: 0.0408 - acc: 0.9577 - val_loss: 0.0029 - val_acc: 0.9995
Epoch 2/5
10000/10000 [==============================] - 991s - loss: 0.0021 - acc: 0.9997 - val_loss: 1.9070e-07 - val_acc: 1.0000
Epoch 3/5
10000/10000 [==============================] - 990s - loss: 0.0011 - acc: 0.4531 - val_loss: 1.1921e-07 - val_acc: 0.2440

That's after 3 epochs, on my second attempt at getting this working. This run used shuffle=True on the training generator (train_datagen). I initially thought shuffling might be the issue, so here are the results with shuffle=False:

10000/10000 [==============================] - 1168s - loss: 0.0079 - acc: 0.9975 - val_loss: 0.0031 - val_acc: 0.9995
Epoch 2/5
10000/10000 [==============================] - 1053s - loss: 0.0032 - acc: 0.9614 - val_loss: 1.1921e-07 - val_acc: 0.2439
Epoch 3/5
10000/10000 [==============================] - 1029s - loss: 1.1921e-07 - acc: 0.2443 - val_loss: 1.1921e-07 - val_acc: 0.2438
Epoch 4/5
10000/10000 [==============================] - 1017s - loss: 1.1921e-07 - acc: 0.2439 - val_loss: 1.1921e-07 - val_acc: 0.2438
Epoch 5/5
10000/10000 [==============================] - 1041s - loss: 1.1921e-07 - acc: 0.2445 - val_loss: 1.1921e-07 - val_acc: 0.2435

I use categorical_crossentropy for loss, since I have 3 classes. I have more data than needed (about 178,000 images, all classified into 1 of 3 classes).

Am I misunderstanding something, or has something gone wrong?

Here's my full code:

# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense

# Initialising the CNN
classifier = Sequential()
# Step 1 - Convolution
classifier.add(Conv2D(32, (3, 3), input_shape = (200, 200, 3), activation = 'relu'))
# Step 2 - Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Adding a second convolutional layer
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# Step 3 - Flattening
classifier.add(Flatten())
# Step 4 - Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dense(units = 3, activation = 'sigmoid'))
# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Part 2 - Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
                target_size = (200, 200),
                batch_size = 64,
                class_mode = 'categorical',
                shuffle=True)

test_set = test_datagen.flow_from_directory('dataset/test_set',
                target_size = (200, 200),
                batch_size = 62,
                class_mode = 'categorical',
                shuffle=True)

classifier.fit_generator(training_set,
                steps_per_epoch = 10000,
                epochs = 5,
                validation_data = test_set,
                validation_steps=1000)

classifier.save("CSGOHeads.h5")
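# Part 3 - Making new predictions
import numpy as np
from keras.preprocessing import image
test_image = image.load_img('dataset/single_prediction/1.bmp', target_size = (200, 200))
test_image = image.img_to_array(test_image)
# Apply the same 1/255 rescaling that the training generator uses
test_image = test_image / 255.
test_image = np.expand_dims(test_image, axis = 0)
result = classifier.predict(test_image)
# Mapping from class name to output index, inferred from the directory names
print(training_set.class_indices)
# Pick the highest-scoring class instead of testing a float for equality with 1
# (assumes 'head' is index 0, as the original check did)
if np.argmax(result[0]) == 0:
    prediction = 'head'
else:
    prediction = 'not'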

Solution

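  • Since you are classifying each image into exactly one of 3 classes (this is called single-label multi-class classification: there are multiple classes, but each image has only one label), you should use softmax rather than sigmoid as the activation function of the last layer. Sigmoid scores each class independently, so the three outputs need not sum to 1, whereas softmax produces a probability distribution over the mutually exclusive classes, which is what categorical_crossentropy expects: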

    classifier.add(Dense(units = 3, activation = 'softmax')) # don't use sigmoid here
    

    If you want me to explain more, let me know and I will update my answer.
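
To illustrate the difference between the two activations, here is a minimal sketch in plain Python rather than Keras. The logits are hypothetical values standing in for the raw outputs of the final Dense layer, and the helper functions are written out by hand for illustration only.

```python
import math


def sigmoid(logits):
    """Apply the logistic function to each logit independently."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def softmax(logits):
    """Turn logits into a probability distribution that sums to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs of the last layer for one image (3 classes).
logits = [4.0, 3.0, -1.0]

sig = sigmoid(logits)
soft = softmax(logits)

print("sigmoid:", [round(p, 3) for p in sig], "sum =", round(sum(sig), 3))
print("softmax:", [round(p, 3) for p in soft], "sum =", round(sum(soft), 3))
```

With sigmoid, two of the three classes can both score close to 1 at the same time, which makes little sense when each image belongs to exactly one class. With softmax, raising the score of one class necessarily lowers the others, and the outputs always sum to 1.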