To conduct cross-validation (resampling) in the mlr R package, we normally need to call the makeResampleDesc function to specify the method and the number of folds.
My questions are:
Does makeResampleDesc in mlr make sure that the folds created are consistent (between different learners, under the same seed of course), and can they be exported for further manipulation?

The resample description is independent of any learner; you can use one with several learners and, under the same seed, get the same folds. To fix the splits explicitly, you can create a resample instance from the description with makeResampleInstance and reuse it for each learner. You can also extract the fold number from the resample result if you want to link the predictions back to the original data.
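As a rough illustration of reusing one set of folds across learners and pulling the fold numbers back out, here is a minimal sketch. It assumes the built-in iris.task and the rpart and lda learners purely as stand-ins; your own task and learners would take their place.

```r
library(mlr)

set.seed(1)

# Describe 3-fold CV, then instantiate it on the task so the splits are fixed.
rdesc <- makeResampleDesc("CV", iters = 3)
rin <- makeResampleInstance(rdesc, task = iris.task)

# Two different learners (illustrative choices) evaluated on the same instance.
lrn1 <- makeLearner("classif.rpart")
lrn2 <- makeLearner("classif.lda")
r1 <- resample(lrn1, iris.task, rin, show.info = FALSE)
r2 <- resample(lrn2, iris.task, rin, show.info = FALSE)

# Row indices of the test set in each iteration; these can be saved or exported.
test_inds <- rin$test.inds

# Per-observation predictions carry the row id and the fold number (iter),
# which can be merged back onto the original data.
pred <- as.data.frame(r1$pred)
head(pred[, c("id", "iter", "truth", "response")])
```

Because both calls receive the same resample instance, the id and iter columns of the two results should line up row by row.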
You can use a column in the data as the fold column using the blocking argument to makeClassifTask. From the help:
blocking: [‘factor’]
An optional factor of the same length as the number of observations. Observations with the same blocking level “belong together”. Specifically, they are either put all in the training or the test set during a resampling iteration. Default is ‘NULL’ which means no blocking.
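A minimal sketch of blocking follows, under a few assumptions: the fold column here is made up for illustration and attached to iris, and the blocking.cv = TRUE argument to makeResampleDesc is assumed to be available, as I believe it is in recent mlr releases (older releases appear to honour the task's blocking factor without it). Check the help page for your installed version.

```r
library(mlr)

set.seed(1)

# Hypothetical data: iris plus a made-up `fold` column with 5 groups.
dat <- iris
dat$fold <- factor(sample(rep(1:5, length.out = nrow(dat))))

# Pass the column as the blocking factor and keep it out of the features.
fold <- dat$fold
task <- makeClassifTask(
  data = dat[, names(dat) != "fold"],
  target = "Species",
  blocking = fold
)

# Assumption: blocking.cv = TRUE asks CV to respect the blocks (recent mlr versions).
rdesc <- makeResampleDesc("CV", iters = 5, blocking.cv = TRUE)
rin <- makeResampleInstance(rdesc, task = task)

# Which blocking levels ended up in each test set.
blocks_per_test <- lapply(rin$test.inds, function(i) sort(unique(as.character(fold[i]))))
blocks_per_test
```

If blocking is respected, every test set should consist of whole blocks, so no level of fold is split between training and test within an iteration.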