I added the class attribute to my dataset as follows (the same for the training and test sets):
ArrayList<String> nomValues = new ArrayList<>();
nomValues.add("1");
nomValues.add("0");
datasetBinary_train.insertAttributeAt(new Attribute("class", nomValues), datasetBinary_train.numAttributes());
So I assume the value "1" is at position 0 and the value "0" is at position 1.
Thus I assume that the double[] I get from NominalPrediction.distribution() will have the class probability for class "1" at position 0.
Examining the classification result, it seems to be the other way around.
One prediction looks like this:
NOM: 1.0 0.0 1.0 0.6081479321383793 0.3918520678616207
where I read 1 as the actual class and 0 as the predicted class, followed by the weight and then the distribution. Since "0" is predicted, I would expect "0" to have the higher probability, which would mean that the probability for the label "0" is stored at index 0.
The Evaluation header says
@attribute (...)
@attribute class {1,0}
So up to that point the order is correct.
Can someone tell me how the attribute values are ordered in the Evaluation? How can I make sure that the right one is selected?
You're confusing labels with label indices (Weka uses 0-based indices internally to represent labels):
NOM: <actual label index> <predicted label index> <weight> <distribution>
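The actual label index is 1.0 (label "0"), the predicted label index is 0.0 (label "1"), the weight is 1.0 and the distribution is "0.61 0.39". The distribution follows the declared label order {1,0}, so index 0 holds the probability for "1". Because index 0 has the highest probability, the first label is predicted (index 0.0, i.e. label "1").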
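To make the index-to-label mapping concrete, here is a minimal plain-Java sketch. It does not call Weka: the list `labels` stands in for the nominal class attribute declared as {1,0}, and the names `LabelIndexDemo`, `labelOf` and `argMax` are made up for illustration. With a real Weka attribute, `Attribute.value(int)` and `Attribute.indexOfValue(String)` should give the same mapping.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in for a Weka nominal attribute (hypothetical helper class).
// Labels keep the order in which they were declared: {1,0}.
class LabelIndexDemo {

    // Translate a 0-based label index (as printed in the NOM line) into its label.
    static String labelOf(List<String> labels, double index) {
        return labels.get((int) index);
    }

    // Index of the largest probability in the distribution.
    static int argMax(double[] distribution) {
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Same declaration order as in the question: "1" first, then "0".
        List<String> labels = Arrays.asList("1", "0");

        // Values taken from the NOM line in the question.
        double actualIndex = 1.0;
        double predictedIndex = 0.0;
        double[] distribution = {0.6081479321383793, 0.3918520678616207};

        System.out.println("actual label:     " + labelOf(labels, actualIndex));
        System.out.println("predicted label:  " + labelOf(labels, predictedIndex));
        System.out.println("most likely label: " + labelOf(labels, argMax(distribution)));
        System.out.println("index of \"1\":     " + labels.indexOf("1"));
    }
}
```

Run against the values from the question, this reports "0" as the actual label and "1" as both the predicted and the most likely label, which matches the reading above: the numbers in the NOM line are positions in {1,0}, not the label strings themselves.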