
WEKA voting algorithm classifications are inconsistent with probabilities?


My problem is with the WEKA Vote meta-classifier, run from the command line with the average-of-probabilities combination rule (-R AVG) over the following base classifiers:

  1. SMO
  2. MultilayerPerceptron
  3. J48

java -classpath weka.jar -Xmx1G weka.classifiers.meta.Vote -S 1 -p 0 -distribution -t train.arff -T test.arff -B "weka.classifiers.functions.SMO -C 1.0 -L 0.001 -P 1.0E-12 -N 0 -V -1 -W 1 -M -K \"weka.classifiers.functions.supportVector.PolyKernel -C 250007 -E 1.0\"" -B "weka.classifiers.functions.MultilayerPerceptron -L 0.3 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H a" -B "weka.classifiers.trees.J48 -C 0.25 -M 2" -R AVG

Weka results output:

inst#     actual  predicted error distribution
 1        1:?   1:active       *0.311,0.689 
 2        1:?   1:active       *0.807,0.193 
 3        1:?   1:active       *0.187,0.813

I'm confused as to why instances 1 and 3 are classified as "active" when, according to the distribution, the alternative "inactive" has the higher probability.

For example, if line 1 were consistent, I would expect it to look something like:

 1        1:?   2:inactive       0.311,*0.689

Any explanation or direction will be greatly appreciated.
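
To make the expectation concrete: with the average-of-probabilities rule, the combined distribution should be the element-wise mean of the base classifiers' distributions, and the predicted class should be the one with the largest combined probability. The following is a minimal standalone sketch of that rule, not WEKA's actual implementation. The class and method names (`VoteConsistencyCheck`, `average`, `argMax`) are hypothetical, and the only data used are the three distributions from the output above.

```java
public class VoteConsistencyCheck {

    /** Element-wise mean of the per-classifier distributions (the assumed AVG rule). */
    static double[] average(double[][] dists) {
        double[] avg = new double[dists[0].length];
        for (double[] d : dists) {
            for (int i = 0; i < avg.length; i++) {
                avg[i] += d[i] / dists.length;
            }
        }
        return avg;
    }

    /** Index of the largest probability; the first index wins on ties. */
    static int argMax(double[] dist) {
        int best = 0;
        for (int i = 1; i < dist.length; i++) {
            if (dist[i] > dist[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Class labels in the order they appear in the output above.
        String[] labels = {"active", "inactive"};
        // Combined distributions exactly as reported for instances 1 to 3.
        double[][] reported = {
            {0.311, 0.689},
            {0.807, 0.193},
            {0.187, 0.813}
        };
        for (int i = 0; i < reported.length; i++) {
            int k = argMax(reported[i]);
            // Print the instance number and the label the distribution implies.
            System.out.println((i + 1) + "  " + (k + 1) + ":" + labels[k]);
        }
    }
}
```

Run on the reported distributions, this picks "inactive" for instances 1 and 3 and "active" for instance 2, which is the labelling I expected to see in the predicted column.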


Solution

  • As it turned out, the fix was to upgrade WEKA. I had been using version 3.5.6; upgrading to 3.7.2 resolved the problem.

    The predictions are now consistent with the distribution, as expected:

     inst#     actual  predicted error distribution
     1        1:? 2:inactive       0.311,*0.689 
     2        1:?   1:active       *0.807,0.193 
     3        1:? 2:inactive       0.187,*0.813