I'm trying to use/import a J48 classifier made in WEKA 3.8.3 (the GUI one) in an Android app, but the resulting class only returns one classification result regardless of the incoming values.
The raw dataset looks like this. The classes are imbalanced, and the first answer I could find on SO suggested using SMOTE to compensate. I couldn't find SMOTE, but from what I can see the ClassBalancer filter serves a similar purpose. The resulting data looks like this.
These are the results for the raw data, and these are the results for the filtered data.
The J48 class built from the raw data only returns 2.0, which corresponds to walking. I'm guessing this is because walking is by far the most common class in the training set. The J48 class built from the filtered data only returns 1.0, which I can't quite explain.
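To make the balancing step concrete, here is a minimal sketch of class reweighting in plain Java. It assumes the general idea of giving every class the same total weight while keeping the overall weight sum unchanged; it is not WEKA's ClassBalancer code. The class counts and the "standing" label are hypothetical, since the real distribution is only shown in the screenshots.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ClassReweightSketch {
    // Per-instance weight for each class, chosen so that every class ends up
    // with the same total weight and the grand total equals the instance count.
    static Map<String, Double> balancedWeights(Map<String, Integer> counts) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        Map<String, Double> weights = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            weights.put(e.getKey(), (double) total / (counts.size() * e.getValue()));
        }
        return weights;
    }

    public static void main(String[] args) {
        // Hypothetical counts: walking dominates, as in the raw dataset.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("walking", 800);
        counts.put("sitting", 100);
        counts.put("standing", 100);
        System.out.println(balancedWeights(counts));
    }
}
```

With these made-up counts, each walking row gets a weight below 1 and each minority row a weight above 1. Unlike SMOTE, no new rows are synthesized.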
I've already tried loading a working classifier from a previous project into the app; this (J48) classifier did return different values. I've also tried manually removing rows of data until all activities had the same number of entries, but this did not resolve the problem. I've also tried manually passing in a value I know should correspond to "sitting", but this did not work either.
The instance sent to the classifier looks perfectly fine when I print out the value to the log, so I don't think it's something to do with the input. Just in case I'm missing something obvious though, I'll include the code below:
    public double createInstances(double x, double y, double z) {
        double result = 0;

        // Empty dataset header built from the attribute list "atts"
        Instances dataRaw = new Instances("TestInstances", atts, 3);
        // Use the last attribute as the class attribute
        dataRaw.setClassIndex(dataRaw.numAttributes() - 1);
        Log.d(TAG, "Rawdata:" + dataRaw);

        // Single instance holding the three accelerometer readings
        Instance inst = new DenseInstance(3);
        inst.setDataset(dataRaw);
        inst.setValue(accel_x, x);
        inst.setValue(accel_y, y);
        inst.setValue(accel_z, z);
        Log.d(TAG, "NEWDATARAW:" + inst);

        try {
            // "weka" is the wrapper around the generated J48 class
            result = weka.classifyInstance(inst);
            Log.d(TAG, "RESULT:" + result);
        } catch (Exception e) {
            Log.e(TAG, "DID NOT WORK:" + e);
        }
        return result;
    }
I have also made sure that the attributes are getting added to the Attributes ArrayList "atts".
The only remaining explanation I can come up with is that there is something wrong with the data format, but I would expect that to cause visible problems when building the classifier. This is an example of the ARFF file containing the data.
I've tried training on both datasets (raw/filtered) with a default J48 classifier, and with one that has all pruning options disabled in case it was pruning away the other activities, but none of these classifiers returned more than one activity.
I think that's all the relevant information, but if I missed anything please tell me so I can add it. I'm fairly sure the problem is somewhere in the classifier itself, but I have no clue what it is exactly.
Turns out WEKA placed the following code in the classifyInstance method:
    // set class value to missing
    s[i.classIndex()] = null;
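Because createInstances sets the class index to the last attribute and the instance only has three attributes, this set the last value in the array (in this case the accel_z value) to null. As a result the classifier always returned the same value, since the null check on that slot is the first test in the generated tree (note the "if (i[2] == null)" statement):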
    static double N69cd8e58128(Object []i) {
        double p = Double.NaN;
        if (i[2] == null) {
            p = 1;
        } else if (((Double) i[2]).doubleValue() <= 6.43) {
            p = WekaClassifier.Nad3034129(i);
        } else if (((Double) i[2]).doubleValue() > 6.43) {
            p = WekaClassifier.N26ef16ed242(i);
        }
        return p;
    }
I'm not sure what purpose this line normally serves, but removing it seems to have fixed the issue.
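As the generated comment says, the line exists to blank out the class value so the tree never sees the label. The effect can be reproduced without WEKA in a small plain-Java sketch. The root node below mirrors the generated one, but the two subtree calls are replaced by hypothetical placeholder return values (0 and 2), and NaN stands in for a missing value. It compares a three-slot input, where the class index lands on accel_z, with a four-slot input that has a separate class slot.

```java
class NullClassSlotDemo {
    // Mirrors the generated root node: slot 2 (accel_z) is tested first.
    // The return values 0 and 2 are placeholders for the real subtrees.
    static double rootNode(Object[] i) {
        double p = Double.NaN;
        if (i[2] == null) {
            p = 1;
        } else if (((Double) i[2]).doubleValue() <= 6.43) {
            p = 0; // placeholder for the "<= 6.43" subtree
        } else {
            p = 2; // placeholder for the "> 6.43" subtree
        }
        return p;
    }

    // Mirrors the wrapper logic: copy the values, then null the class slot.
    static double classify(double[] values, int classIndex) {
        Object[] s = new Object[values.length];
        for (int j = 0; j < s.length; j++) {
            if (!Double.isNaN(values[j])) {
                s[j] = Double.valueOf(values[j]);
            }
        }
        s[classIndex] = null; // set class value to missing
        return rootNode(s);
    }

    public static void main(String[] args) {
        // Three slots, class index 2: accel_z is wiped, result is always 1.0.
        System.out.println(classify(new double[]{0.1, 0.2, 3.0}, 2));
        System.out.println(classify(new double[]{0.1, 0.2, 9.0}, 2));
        // Four slots, class index 3: accel_z survives and the result varies.
        System.out.println(classify(new double[]{0.1, 0.2, 3.0, Double.NaN}, 3));
        System.out.println(classify(new double[]{0.1, 0.2, 9.0, Double.NaN}, 3));
    }
}
```

In WEKA terms, the four-slot case would correspond to adding the class attribute as a fourth entry in "atts" and creating the instance with DenseInstance(4), leaving the class value unset, so that classIndex() no longer points at accel_z. This is an assumption about the setup, since the construction of "atts" is not shown, but it may be a safer fix than editing the generated code.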