Tags: python, machine-learning, computer-vision, conv-neural-network, overfitting-underfitting

Validation accuracy doesn't change at all while training a CNN network


I am trying to implement AlexNet for classification on the Intel image dataset. During training I get a high accuracy (0.84), but the validation accuracy stays very low (0.16) and does not change. I have tried different optimizers and learning rates, and it did not help.

Thank you for your help.

The data consists of about 14k training images and 3k test images in 6 classes. Here are the shapes of the datasets:

Train X shape:  (14034, 150, 150, 3) 
Test X shape:  (3000, 150, 150, 3) 
Train Y shape:  (14034, 6) 
Test Y shape:  (3000, 6)

Here is the code:

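# Imports used by the code below
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import SGD

# Creating AlexNet network
model = keras.Sequential()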

# Layer 1
model.add(Conv2D(96, (11, 11), strides=4, activation='relu', input_shape=(150, 150, 3)))
model.add(MaxPool2D(pool_size=(3, 3), strides=2))

# Layer 2
model.add(Conv2D(256, (5, 5), strides=1, padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(3, 3), strides=2))

# Layer 3, 4, and 5
model.add(Conv2D(384, (3, 3), strides=1, padding='same', activation='relu'))
model.add(Conv2D(384, (3, 3), strides=1, padding='same', activation='relu'))
model.add(Conv2D(256, (3, 3), strides=1, padding='same', activation='relu'))
model.add(MaxPool2D(pool_size=(3, 3), strides=2))

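# Layer 6 (no activation is set, so this Dense layer is linear)
model.add(Flatten())
model.add(Dense(4096))

# Layer 7 (also linear; dropout is applied before the Dense layer)
model.add(Dropout(0.5))
model.add(Dense(4096))

# Layer 8: 6-way softmax output, one unit per class
model.add(Dropout(0.5))
model.add(Dense(6, activation='softmax'))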

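# Plain SGD with a learning rate of 0.01
opt = SGD(learning_rate=0.01)
model.compile(optimizer=opt,
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Batch both splits; the test set is used as validation data
train_dataset = tf.data.Dataset.from_tensor_slices((train_X, train_Y)).batch(128)
test_dataset = tf.data.Dataset.from_tensor_slices((test_X, test_Y)).batch(128)

# Note: fit() ignores `shuffle` when given a tf.data.Dataset, so the batches keep their original order
history = model.fit(train_dataset, epochs=70, validation_data=test_dataset, shuffle=True)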
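To check that the layer stack above still makes sense for 150x150 inputs, the spatial size after each conv/pool layer can be traced by hand. The sketch below is plain Python, not Keras; the helper `out_size` and the `layers` table are hypothetical names introduced here, and the formulas assume the Keras defaults ('valid' padding unless 'same' is given).

```python
def out_size(size, kernel, stride, padding="valid"):
    """Spatial output size of a conv or pooling layer (Keras padding rules)."""
    if padding == "same":
        return -(-size // stride)          # ceil(size / stride)
    return (size - kernel) // stride + 1   # 'valid': no padding

# (name, kernel, stride, padding, output channels), copied from the model above
layers = [
    ("conv1", 11, 4, "valid", 96),
    ("pool1", 3, 2, "valid", 96),
    ("conv2", 5, 1, "same", 256),
    ("pool2", 3, 2, "valid", 256),
    ("conv3", 3, 1, "same", 384),
    ("conv4", 3, 1, "same", 384),
    ("conv5", 3, 1, "same", 256),
    ("pool3", 3, 2, "valid", 256),
]

size = 150  # input images are 150 x 150 x 3
sizes = []
for name, kernel, stride, padding, channels in layers:
    size = out_size(size, kernel, stride, padding)
    sizes.append(size)
    print(f"{name}: {size} x {size} x {channels}")

# Number of features that Flatten() passes to the first Dense(4096) layer
flat = size * size * layers[-1][4]
print("flattened features:", flat)
```

Under these assumptions the feature map shrinks to 3 x 3 x 256 before Flatten(), so the architecture is at least shape-consistent for this input size.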

Here is the output:

Epoch 1/70
110/110 [==============================] - 413s 4s/step - loss: 0.5867 - accuracy: 0.8835 - val_loss: 6.4170 - val_accuracy: 0.1670
Epoch 2/70
110/110 [==============================] - 421s 4s/step - loss: 0.8547 - accuracy: 0.7973 - val_loss: 5.3743 - val_accuracy: 0.1670
Epoch 3/70
 67/110 [=================>............] - ETA: 2:43 - loss: 0.8841 - accuracy: 0.7100 - val_loss: 4.8517 - val_accuracy: 0.1670

val_accuracy does not change at all.
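As a quick sanity check on the log above, the constant 0.1670 can be compared with the chance level for 6 classes. This sketch only uses numbers from this post; that the model outputs the same class for every validation image is a guess that would fit these numbers, not something the log proves.

```python
num_classes = 6
val_samples = 3000             # Test X shape: (3000, 150, 150, 3)
reported_val_accuracy = 0.1670

# Accuracy expected from guessing uniformly among the classes
chance = 1 / num_classes

# Number of validation images the reported accuracy corresponds to
correct = round(reported_val_accuracy * val_samples)

print(f"chance level: {chance:.4f}")
print(f"correct validation predictions: {correct} of {val_samples}")
```

The reported value sits almost exactly at chance level, which suggests the validation predictions carry no information even though the training accuracy looks high.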


Solution

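  • Apparently the model was not learning anything that carried over to the validation set: val_accuracy stayed at 0.1670, which is about 1/6, i.e. chance level for 6 classes. I solved this by using the Adam optimizer instead of SGD, with a smaller learning rate of 0.001 instead of 0.01.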
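A minimal sketch of that change, assuming the same loss and metric as in the question. The helper `compile_with_adam` and the tiny `demo` model are hypothetical stand-ins so the snippet runs on its own; in the real script the AlexNet `model` from the question would be passed in instead.

```python
from tensorflow import keras

def compile_with_adam(model, learning_rate=0.001):
    """Compile a Keras model with Adam instead of SGD (hypothetical helper)."""
    opt = keras.optimizers.Adam(learning_rate=learning_rate)
    model.compile(optimizer=opt,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

# Stand-in model with the same input shape and 6-way softmax output
demo = keras.Sequential([
    keras.Input(shape=(150, 150, 3)),
    keras.layers.Flatten(),
    keras.layers.Dense(6, activation='softmax'),
])
compile_with_adam(demo)

# Confirm which optimizer and learning rate are now in use
print(type(demo.optimizer).__name__, demo.optimizer.get_config()["learning_rate"])
```

The rest of the training code (the `tf.data` datasets and the `model.fit` call) would stay as in the question.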