matlab, machine-learning, cross-validation, sampling

How to perform stratified 10-fold cross-validation for classification in MATLAB?


My implementation of ordinary K-fold cross-validation looks roughly like this:

K = 10;
% B holds one observation per column, so size(B,2) is the number of observations
CrossValIndices = crossvalind('Kfold', size(B,2), K);

for i = 1:K
    disp(['Cross validation, fold ' num2str(i)])
    IndicesI = (CrossValIndices == i);   % logical mask of the test fold

    xTraining = B(:, ~IndicesI);
    tTrain = T_new1(:, ~IndicesI);

    xTest = B(:, IndicesI);
    tTest = T_new1(:, IndicesI);
end

However, I want to ensure that the training, testing, and validation sets have similar class proportions (there are, e.g., 20 classes), so I want to use stratified sampling. The main purpose is to avoid the class imbalance problem. I know about the SMOTE technique, but I want to apply stratified sampling instead.


Solution

  • You can simply use crossvalind('Kfold', Group, K), where Group is a vector containing the class label of each observation. Each class is then split across the K folds separately, so every fold has approximately the same class proportions as the full data set.
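
As a minimal sketch of how this could slot into the loop from the question: the toy data below is purely hypothetical (20 classes, 30 observations each), and it assumes T_new1 is a one-hot target matrix with one column per observation, which the question does not state. If your labels are stored differently, build Group from them directly. Note that crossvalind requires the Bioinformatics Toolbox.

```matlab
% Hypothetical toy data standing in for the question's B and T_new1.
rng(0);                                    % reproducible fold assignment
numClasses = 20;
perClass   = 30;
labels = repelem((1:numClasses)', perClass);    % 600x1 class labels
B      = rand(5, numel(labels));                % 5 features x 600 observations
T_new1 = double((1:numClasses)' == labels');    % ASSUMED one-hot targets, 20x600

% Recover one class label per observation from the one-hot targets.
[~, Group] = max(T_new1, [], 1);
Group = Group(:);                               % column vector of labels

K = 10;
CrossValIndices = crossvalind('Kfold', Group, K);   % folds stratified by Group

classCounts = zeros(K, numClasses);   % test-set size per fold and class
for i = 1:K
    testMask  = (CrossValIndices == i);

    xTraining = B(:, ~testMask);
    tTrain    = T_new1(:, ~testMask);
    xTest     = B(:, testMask);
    tTest     = T_new1(:, testMask);

    % Count how many test observations of each class fall into fold i.
    classCounts(i, :) = accumarray(Group(testMask), 1, [numClasses 1])';
end

disp(classCounts)   % rows: folds, columns: classes
```

With 30 observations per class and K = 10, each row of classCounts should show about 3 test observations per class, which is a quick way to check that the folds really are stratified.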