machine-learning, pytorch, training-data, overfitting-underfitting

Does my model overfit or underfit, based on the learning curves I obtained?


I want to find a good fit for my data, so I started by training a basic model (for a simple binary classification problem) and plotted the learning curves from training. The plot I got:

[Image: Learning curves for training of my model]

However, I am not sure what these curves mean: does the model overfit or underfit? There seems to be quite a big gap between the training and validation curves, so I assumed it was overfitting and added some regularization. However, that only seemed to decrease both validation and test set accuracy, and the gap remained more or less the same. So is this plot showing bias then?


Solution

  • When both loss curves have reached a plateau, are close to each other, and are fairly high, the model is underfitting.

    When the curves are separated by a substantial gap and the model performs significantly better on the training data (training loss is fairly low) than on the validation data (validation loss is fairly high in the graph), the model is overfitting.

    Ideally, you'd want both curves to stay low in the graph and very close to each other, with not much of a gap in between.
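
These rules of thumb can be turned into a rough check on the recorded losses. The following is only a minimal sketch: the function name `diagnose_fit` and the thresholds `high_loss`, `max_gap`, and `tail` are hypothetical choices for illustration, not standard values, and it assumes training has run long enough for both curves to plateau. What counts as a "high" loss or a "substantial" gap depends on your loss function and data, so tune these for your own problem.

```python
def diagnose_fit(train_loss, val_loss, high_loss=0.5, max_gap=0.1, tail=5):
    """Label a pair of loss curves as overfitting, underfitting, or good fit.

    train_loss, val_loss: per-epoch loss values (lower is better).
    high_loss, max_gap, tail: illustrative assumptions, not standard values.
    """
    if not train_loss or len(train_loss) != len(val_loss):
        raise ValueError("need two non-empty loss lists of equal length")

    # Average the last few epochs to smooth out epoch-to-epoch noise.
    n = min(tail, len(train_loss))
    train_end = sum(train_loss[-n:]) / n
    val_end = sum(val_loss[-n:]) / n

    # Substantial gap with training loss well below validation loss.
    if val_end - train_end > max_gap:
        return "overfitting"
    # Curves are close, but both plateau at a high loss.
    if train_end > high_loss:
        return "underfitting"
    # Curves are close and both are low.
    return "good fit"


# Made-up loss values, purely for illustration.
print(diagnose_fit([0.69, 0.68, 0.67, 0.67, 0.67],
                   [0.70, 0.69, 0.68, 0.68, 0.68]))  # underfitting
print(diagnose_fit([0.60, 0.40, 0.20, 0.10, 0.05],
                   [0.65, 0.55, 0.50, 0.55, 0.60]))  # overfitting
print(diagnose_fit([0.60, 0.40, 0.25, 0.20, 0.20],
                   [0.62, 0.43, 0.28, 0.24, 0.23]))  # good fit
```

If you plot accuracy instead of loss, the picture is flipped: higher is better, so underfitting shows up as two close curves that are both low.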