Tags: machine-learning, cross-validation, k-fold

How do you learn from each fold in k-fold cross-validation?


When performing k-fold cross-validation, every fold has a different validation set and a slightly different training set. Say you progress from the first fold to the second fold. How is what you learned in the first fold inherited by the second fold's iteration? Currently, it seems as if you only calculate the accuracy, and the learned model is discarded rather than retained.

What am I missing? If such a model is retained, how is it retained, and does the method differ between, say, a DQN and a KNN?


Solution

  • K-fold cross-validation does not carry a model over from one fold to the next. Instead, it trains and evaluates K independent models (which could be trained in parallel) on different folds of the dataset, all with the same hyper-parameters. The goal is not to get a more accurate model, but to get a more accurate (statistically speaking) validation by aggregating the per-fold validation scores (i.e., you can estimate the mean and standard deviation of the model's accuracy).

    You can then keep one of those K models and report the aggregated estimate as its metric (instead of the score computed on its particular fold), or train a new model from scratch on the complete dataset. In the latter case, your best estimate of the model's metrics is still the previous aggregated one, although a new, unused test set could be used to compute a fresh estimate. So why would you do that? Because cross-validation is normally combined with hyper-parameter tuning: every time you tune your hyper-parameters, you check the aggregated metric estimated by cross-validation, and only when you finish tuning do you compute the final metrics on the unseen test set.
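The per-fold independence described above can be sketched in a few lines. This is a minimal illustration in pure Python, not tied to any library; the toy 1-nearest-neighbour classifier (`one_nn_predict`) and the 1-D dataset are assumptions made only for demonstration:

```python
# Minimal k-fold cross-validation sketch (pure Python).
# Each fold trains a *fresh* model; nothing is inherited between folds.
import statistics

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def one_nn_predict(train_X, train_y, x):
    """Toy 1-nearest-neighbour 'model': label of the closest training point."""
    dists = [abs(tx - x) for tx in train_X]
    return train_y[dists.index(min(dists))]

def cross_validate(X, y, k=5):
    """Train and evaluate k independent models; return per-fold accuracies."""
    scores = []
    for fold in k_fold_indices(len(X), k):
        # Training set = everything outside the current validation fold.
        train_X = [X[i] for i in range(len(X)) if i not in fold]
        train_y = [y[i] for i in range(len(X)) if i not in fold]
        correct = sum(one_nn_predict(train_X, train_y, X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return scores

# Toy 1-D dataset: values below 5 are class 0, the rest are class 1.
X = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = cross_validate(X, y, k=5)
# The aggregated estimate is what cross-validation is for:
print(statistics.mean(scores), statistics.pstdev(scores))
```

Note that `cross_validate` returns only the scores: none of the per-fold models survives the loop, which is exactly the behaviour the question observes. Whichever model you ultimately keep (or retrain on all the data), the mean and standard deviation of `scores` remain your metric estimate.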