Tags: python, keras, neural-network

My test loss is increasing but my train loss is decreasing for my neural network. What should I do?


My Neural Network

import tensorflow as tf
from tensorflow.keras.layers import Dense

def buildModel(optimizer):
    model = tf.keras.models.Sequential([
        Dense(100, activation='relu'),
        Dense(82, activation='relu'),
        Dense(20, activation='relu'),
        Dense(6, activation='relu'),
        Dense(20, activation='softmax')
    ])
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

model = buildModel('adam')
history = model.fit(train_x, train_y_lst, validation_data=(test_x, test_y_lst),
                    epochs=50, batch_size=32, verbose=0)

Plotting

import matplotlib.pyplot as plt

# Plot training and validation loss
plt.figure()
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss Curves')
plt.legend()

# Plot training and validation accuracy
plt.figure()
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('Training and Validation Accuracy Curves')
plt.legend()

plt.show()

Loss vs Epochs

Test accuracy is also poor.

Accuracy vs Epoch

Any suggestions on where I may be going wrong? I am new to this.

I was expecting the test loss to decrease just like the train loss.

My test_x looks like this

  -0.84335711]
 [-0.1898388  -1.4177287   0.24718753 ... -0.33010045  0.77921928
  -1.56813293]
 [ 0.51887204 -1.34965479  0.19069737 ...  0.56236361 -0.03741466
  -0.24596578]
 ...
 [-0.11631875  0.46366703 -1.04400684 ...  0.23282911 -2.10649511
  -0.41883463]
 [-1.03632829  0.05419996 -2.22371652 ...  0.47133847 -1.70391277
  -1.42387687]
 [-0.12011524 -0.72294703 -0.74587529 ...  0.11331488 -1.81362912
  -0.11828704]]

test_y_lst

array([[1, 0, 0, ..., 0, 0, 0],
       [1, 0, 0, ..., 0, 0, 0],
       [1, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]])

This is a multi-class classification problem.


Solution

  • Since it looks like you are new to the concept, I will describe some ways you can improve your results here and, more generally, when using neural nets.

    Your model is overfitting to the training data. To avoid this:

    1. Always scale your input data. Scaling it to [0, 1] or [-1, 1], depending on your use case, is good enough 99% of the time, and it helps backpropagation.
    2. Use activations knowingly. ReLU sets all negative values in its input to zero, and you are feeding negative values into the model. ReLU can still be useful with negative inputs, depending on the use case, but it is not a master key.
    3. Use Dropout. Simply adding dropout removes a large part of the overfitting: it randomly sets some of a layer's activations to zero, so the model does not overfit to one specific part of your input and generalises better.
    4. Use early stopping. Training longer doesn't always mean a better model; you can stop training before your train and test accuracies start drifting apart.
    5. Use a fair amount of data and features for the task. If you are training a model with 20 output categories, your input data must be sufficient for the model to generalise over the input features, and the dimensionality of the input (which determines how much information it can represent) must be appropriate, neither too high nor too low.
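For point 1, here is a minimal sketch of min-max scaling in plain NumPy (scikit-learn's `MinMaxScaler` does the same thing). The key detail is that the test set must be scaled with the *training* set's statistics; the array shapes below are made up for illustration:

```python
import numpy as np

def minmax_scale(train, test):
    """Scale features to [0, 1] using statistics from the training set only."""
    lo = train.min(axis=0)
    hi = train.max(axis=0)
    span = np.where(hi - lo == 0, 1.0, hi - lo)  # guard against constant columns
    return (train - lo) / span, (test - lo) / span

# Random data standing in for the question's train_x / test_x:
rng = np.random.default_rng(0)
train_x = rng.normal(size=(100, 8))
test_x = rng.normal(size=(30, 8))

train_scaled, test_scaled = minmax_scale(train_x, test_x)
# train_scaled lies in [0, 1]; test_scaled may slightly exceed that range,
# because it was scaled with the training min/max, not its own.
```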

    To start, try scaling your input data to (0, 1), add dropout to one or two of your layers, and see the result.
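Putting points 3 and 4 together, one possible revision of the question's model, keeping the same layer sizes but adding `Dropout` after the wide layers and an `EarlyStopping` callback (the 0.3 rate and patience of 5 are arbitrary starting points, not tuned values):

```python
import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout

def build_model():
    model = tf.keras.models.Sequential([
        Dense(100, activation='relu'),
        Dropout(0.3),  # randomly zero 30% of this layer's activations each step
        Dense(82, activation='relu'),
        Dropout(0.3),
        Dense(20, activation='relu'),
        Dense(6, activation='relu'),
        Dense(20, activation='softmax')
    ])
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model

# Stop when validation loss has not improved for 5 epochs,
# and roll back to the weights from the best epoch.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=5, restore_best_weights=True)

model = build_model()
# Usage with the question's data:
# history = model.fit(train_x, train_y_lst,
#                     validation_data=(test_x, test_y_lst),
#                     epochs=50, batch_size=32,
#                     callbacks=[early_stop], verbose=0)
```

`restore_best_weights=True` matters: without it, training ends with the weights from the last (possibly overfit) epoch rather than the best one.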