I'm trying to build an image classifier for 7 classes using transfer learning with Xception, and now I'm trying to implement cross-validation. I know KFold returns indices, but how can I get the actual data values?
from sklearn.model_selection import KFold
import numpy as np
sample = np.array(['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I'])
kf = KFold(n_splits=3, shuffle=True)
for train_index, test_index in kf.split(sample):
    print("TRAIN:", train_index, "TEST:", test_index)
It returns:
TRAIN: [1 2 3 4 6 7] TEST: [0 5 8]
TRAIN: [0 1 2 4 5 8] TEST: [3 6 7]
TRAIN: [0 3 5 6 7 8] TEST: [1 2 4]
But what I want is:
TRAIN: ['B', 'C', 'D', 'E', 'G', 'H'] TEST: ['A', 'F', 'I']
TRAIN: ['A', 'B', 'C', 'E', 'F', 'I'] TEST: ['D', 'G', 'H']
TRAIN: ['A', 'D', 'F', 'G', 'H', 'I'] TEST: ['B', 'C', 'E']
What should I do?
kf.split returns the indices, not the actual samples. Because sample is a NumPy array, you can index it directly with those index arrays. You only need to change the loop to:
for train_index, test_index in kf.split(sample):
    print("TRAIN:", sample[train_index], "TEST:", sample[test_index])
Result (the folds change from run to run because shuffle=True is used without a random_state):
TRAIN: ['A' 'B' 'C' 'E' 'F' 'H'] TEST: ['D' 'G' 'I']
TRAIN: ['A' 'D' 'F' 'G' 'H' 'I'] TEST: ['B' 'C' 'E']
TRAIN: ['B' 'C' 'D' 'E' 'G' 'I'] TEST: ['A' 'F' 'H']
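The same indexing carries over to image data, as long as the images and labels are NumPy arrays of equal length. The following is only a minimal sketch under that assumption: X and y are hypothetical stand-ins (9 tiny fake "images" and made-up labels for 7 classes), not your real dataset, and random_state=42 is an arbitrary choice to make the folds reproducible.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-in data: 9 tiny 2x2 RGB "images" and their class labels.
X = np.arange(9 * 2 * 2 * 3).reshape(9, 2, 2, 3)
y = np.array([0, 1, 2, 3, 4, 5, 6, 0, 1])

# A fixed random_state makes the shuffled folds identical on every run.
kf = KFold(n_splits=3, shuffle=True, random_state=42)

folds = []
for train_index, test_index in kf.split(X):
    # Use the same index arrays for images and labels so they stay aligned.
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    folds.append((X_train, y_train, X_test, y_test))
    print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

Inside the loop you would build and fit a fresh model on X_train and y_train for each fold, then evaluate it on X_test and y_test. If sample were a plain Python list rather than a NumPy array, indexing with an index array would raise a TypeError, so you would either convert it with np.array(sample) or use a list comprehension such as [sample[i] for i in train_index].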