WEKA Cross Validation:
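import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;

// data is an Instances object that is already loaded and has its class index set
Classifier cls = new J48();
Evaluation eval = new Evaluation(data);
Random rand = new Random(1); // using seed = 1
int folds = 10;
eval.crossValidateModel(cls, data, folds, rand);
System.out.println(eval.toSummaryString());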
What does "rand" mean? How does cross-validation work in this case? Are the 10 folds always shuffled?
Thank you!
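What does "rand" mean?
rand is an instance of java.util.Random that crossValidateModel uses to shuffle the dataset before it is split into folds. The seed determines the sequence of random numbers: the same seed always produces the same shuffle, so the cross-validation results are reproducible, while a different seed gives a different shuffle.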
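To make the role of the seed concrete, here is a small plain-Java sketch. It does not use WEKA at all; the class name SeedDemo and the method shuffledRows are made up for illustration, and Collections.shuffle stands in for whatever shuffling the library does internally. It only shows that two Random objects built from the same seed shuffle a list of row numbers identically.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SeedDemo {
    // Hypothetical helper: returns the row numbers 1..n shuffled
    // with a Random created from the given seed.
    static List<Integer> shuffledRows(int n, long seed) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            rows.add(i);
        }
        Collections.shuffle(rows, new Random(seed));
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> a = shuffledRows(100, 1);
        List<Integer> b = shuffledRows(100, 1);
        List<Integer> c = shuffledRows(100, 2);
        // Same seed: the two shuffles are identical.
        System.out.println("same seed, same order: " + a.equals(b));
        // Print the first five rows for each seed so they can be compared.
        System.out.println("first five rows with seed 1: " + a.subList(0, 5));
        System.out.println("first five rows with seed 2: " + c.subList(0, 5));
    }
}
```

Running it twice prints the same rows both times, which is the reason for passing a fixed seed such as 1 when you want repeatable evaluation numbers.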
How does cross-validation work in this case?
The data set is shuffled, so that if you had data rows 1-100 in order, the first five rows after shuffling might be (77, 12, 4, 7, 55) instead of (1, 2, 3, 4, 5).
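Are the 10 folds always shuffled?
It depends on the tools or libraries you use, but with WEKA's crossValidateModel the answer is yes. The method works on a copy of the data, shuffles it with the Random you pass in, and, if the class attribute is nominal, also stratifies it so that each fold has roughly the same class distribution. Only then is the data cut into 10 folds; each fold is used once as the test set while the other 9 are used for training. Without the shuffle, rows 1-10 would form one fold, rows 11-20 the next, and so on, which causes bias when rows that sit together in the file have similar characteristics. That is why the data is best randomized first.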
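The difference between cutting folds from the original order and cutting them from shuffled data can be sketched in plain Java. This is only an illustration of the idea, not WEKA's actual code: FoldDemo and testFolds are hypothetical names, stratification is left out, and Collections.shuffle is used as a stand-in for the library's own shuffling.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class FoldDemo {
    // Hypothetical helper: shuffles row numbers 1..n with the given seed,
    // then cuts the shuffled list into `folds` contiguous test sets.
    static List<List<Integer>> testFolds(int n, int folds, long seed) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            rows.add(i);
        }
        Collections.shuffle(rows, new Random(seed));

        List<List<Integer>> result = new ArrayList<>();
        for (int k = 0; k < folds; k++) {
            int from = k * n / folds;       // first position of fold k
            int to = (k + 1) * n / folds;   // one past the last position of fold k
            result.add(new ArrayList<>(rows.subList(from, to)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = testFolds(100, 10, 1);
        for (int k = 0; k < folds.size(); k++) {
            // Each fold is the test set for one round; all other rows are the training set.
            System.out.println("test fold " + k + ": " + folds.get(k));
        }
    }
}
```

Every row lands in exactly one test fold, and because the rows were shuffled first, a fold is no longer just rows 1-10 or 11-20 from the file.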