After each YOLOv5 training run, two model files are saved: last.pt and best.pt. I'm aware that:
- last.pt is the latest saved checkpoint of the model. It is updated after each epoch.
- best.pt is the checkpoint with the best validation fitness so far. It is updated whenever the model fitness improves.
Fitness is defined as a weighted combination of the mAP@0.5 and mAP@0.5:0.95 metrics:
    def fitness(x):
        # Returns fitness (for use with results.txt or evolve.txt)
        w = [0.0, 0.0, 0.1, 0.9]  # weights for [P, R, mAP@0.5, mAP@0.5:0.95]
        return (x[:, :4] * w).sum(1)
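As a quick illustration of what this computes, the function can be applied to a small array of results. The sketch below restates the function so that it runs on its own. The `results` array and its values are made up for the example and do not come from a real training run.

```python
import numpy as np

def fitness(x):
    # Weighted combination of [P, R, mAP@0.5, mAP@0.5:0.95]
    w = [0.0, 0.0, 0.1, 0.9]
    return (x[:, :4] * w).sum(1)

# Hypothetical per-epoch results (invented for illustration):
# one row per epoch, columns are [P, R, mAP@0.5, mAP@0.5:0.95]
results = np.array([
    [0.60, 0.50, 0.50, 0.30],
    [0.70, 0.60, 0.60, 0.40],
    [0.90, 0.40, 0.55, 0.35],
])

scores = fitness(results)            # one fitness value per row
best_epoch = int(scores.argmax())    # index of the row with the highest fitness
```

Because precision and recall have zero weight, the high precision in the last row does not affect its score. Only the two mAP columns decide which row wins.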
My question is: if the training continued for too many epochs (and last.pt is thus overfitted), is best.pt then a checkpoint from when the training was not yet overfit? In other words, does best.pt control for overfitting?
We can assume that best.pt performs well on non-training data when the model is regularized. However, I have seen some researchers choose a checkpoint that does not have the best validation result, because validation loss has its limitations as a selection criterion.
If training your model is not expensive, you could consider that option; otherwise, you can just pick best.pt.
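To make the difference between the two files concrete, here is a minimal sketch of a keep-the-best rule like the one described in the question. It is a simplified stand-in, not the actual YOLOv5 training code: `track_checkpoints` is a hypothetical helper, the fitness values are invented, and the strict `>` comparison is an assumption (the real script may treat ties differently).

```python
def track_checkpoints(fitness_per_epoch):
    """Return (best_epoch, last_epoch) under a keep-the-best rule."""
    best_fitness = float("-inf")
    best_epoch = None
    last_epoch = None
    for epoch, fit in enumerate(fitness_per_epoch):
        last_epoch = epoch        # "last" is overwritten every epoch
        if fit > best_fitness:    # "best" is overwritten only on improvement
            best_fitness = fit
            best_epoch = epoch
    return best_epoch, last_epoch

# Hypothetical validation fitness: rises, then falls as the model overfits
val_fitness = [0.20, 0.31, 0.38, 0.42, 0.41, 0.39, 0.36]
best_epoch, last_epoch = track_checkpoints(val_fitness)
```

In this toy run the best checkpoint comes from epoch 3, before the validation fitness starts to drop, while the last checkpoint comes from epoch 6. The selection is still made on the validation set, so a separate test set gives a cleaner estimate of how well the chosen checkpoint generalizes.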
In addition, you can select several of the models and ensemble them, for example the best, the second best, and so on. Although the validation losses of these models differ only slightly, the ensemble can perform better than the best model alone.
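The idea can be sketched with plain prediction averaging. This is a deliberately simplified, classification-style example: `ensemble_mean` is a hypothetical helper and the probabilities are invented. For object detection the ensembling step is more involved, since box predictions from the models are typically merged before non-maximum suppression.

```python
import numpy as np

def ensemble_mean(prob_list):
    """Average class probabilities predicted by several models."""
    stacked = np.stack(prob_list, axis=0)   # (n_models, n_samples, n_classes)
    return stacked.mean(axis=0)

# Hypothetical outputs of three checkpoints for 2 samples and 2 classes
p_best = np.array([[0.6, 0.4], [0.3, 0.7]])
p_second = np.array([[0.7, 0.3], [0.6, 0.4]])
p_third = np.array([[0.8, 0.2], [0.3, 0.7]])

avg = ensemble_mean([p_best, p_second, p_third])
labels = avg.argmax(axis=1)   # predicted class per sample after averaging
```

On the second sample the second-best model disagrees with the other two, and the averaged prediction follows the majority.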