java boolean weka

Count boolean values that are equal between two strings


I want to count the number of true values when comparing two Strings (the predicted and the actual class label) from my training data. However, the code I implemented prints a running count for every matching instance instead of only the total number of matches.

// Load dataset
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class DatasetLoading {

  public static Instances loadData(String location) {
    try {
      return DataSource.read(location);
    }
    catch (Exception e) {
      System.err.println("Failed to load data from: " + location);
      e.printStackTrace();
      return null;
    }
  }

  public static void main(String[] args) {
    String dataLocation = "C:/Users/Emil/Desktop/Machine Learning - Java/Week 1/Arsenal_TRAIN1.arff";
    Instances train = loadData(dataLocation);
    System.out.println(train);
  }
}


import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;

public class ForClassifier {
  public static void main(String[] args) throws Exception {
    String train1 = "C:/Users/Emil/Downloads/Week 1/Arsenal_TRAIN.arff";

    Instances train = DatasetLoading.loadData(train1);

    // train data
    train.setClassIndex(train.numAttributes() - 1);

    Classifier Model = (Classifier) new NaiveBayes();
    Model.buildClassifier(train);

    int z = 0;
    double x = 0;
    String x2 = null;
    for (int i = 0; i < train.numInstances(); i++) {
      // return data
      String trueClassLabel = train.instance(i).toString(train.classIndex());
      double predicted = Model.classifyInstance(train.get(i));

      if (predicted == 0.0) {
        x = predicted;
      } else if (predicted == 1.0) {
        x = predicted;
      } else if (predicted == 2.0) {
        x = predicted;
      }

      if (x == 0.0) {
        String x1 = "Loss";
        x2 = x1;
      } else if (x == 1.0) {
        String x1 = "Draw";
        x2 = x1;
      } else if (x == 2.0) {
        String x1 = "Win";
        x2 = x1;
      }

      // System.out.println(x2 + "\t" + trueClassLabel + "\t" + x2.equals(trueClassLabel));

      if (x2.equals(trueClassLabel)) {
        z++;
        System.out.println(z);
      }
    }
  }
}
    

The output that I get:

1
2
3
4
5
6
7
8
9
10
11
12
13

The expected output:

13

I have also tried getting the maximum value; however, this returns 1 and not 13:

 if(x2.equals(trueClassLabel)) {
            
            z++;
            Integer[] test2= {z};
            
            for(int j = 0; j<test2.length;j++) {
                if(test2[max] < test2[i]) {
                    max=i;
                    

                }
            }System.out.println(test2[max]);//1

@data:

@RELATION Arsenal

@ATTRIBUTE Leno  {0,1}
@ATTRIBUTE Tierney   {0,1}
@ATTRIBUTE Saka  {0,1}
@ATTRIBUTE class    {Loss,Draw,Win}
@DATA

1, 0,  0,  Loss
1, 0,  0,  Loss
0, 1,  1,  Draw
1, 0,  1,  Draw
0, 0,  1,  Win
0, 1,  1,  Win
1, 1,  1,  Win
0, 1,  1,  Win
1, 1,  0,  Win
1, 0,  1,  Win
1, 1,  0,  Loss
0, 1,  0,  Draw
1, 1,  0,  Draw
1, 1,  0,  Draw
0, 0,  1,  Win
1, 0,  1,  Win
0, 1,  1,  Win
1, 1,  0,  Win
1, 1,  1,  Win
1, 1,  0,  Win

Solution

  • Instead of comparing strings, why don't you just compare the numeric prediction obtained from classifyInstance with the actual numeric class label from the training data (train.instance(i).classValue())?

    Since you didn't post your full code (the DatasetLoading class is missing), here is a simple rewrite of your code. The class expects the filename of the dataset to use as the first parameter. This class uses two approaches for evaluating the model: manual comparison of the predictions and using Weka's Evaluation class (which gives you a whole lot more statistics).

    import weka.classifiers.Classifier;
    import weka.classifiers.Evaluation;
    import weka.classifiers.bayes.NaiveBayes;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;
    
    public class ForClassifier {
    
      public static void main(String[] args) throws Exception {
        // load dataset
        Instances train = DataSource.read(args[0]);
        train.setClassIndex(train.numAttributes() - 1);
    
        // build classifier
        Classifier model = new NaiveBayes();
        model.buildClassifier(train);
    
        // 1. manual evaluation
        System.out.println("manual evaluation");
        int correct = 0;
        int incorrect = 0;
        for (int i = 0; i < train.numInstances(); i++) {
          double actual = train.instance(i).classValue();
          double predicted = model.classifyInstance(train.get(i));
          if (actual == predicted)
            correct++;
          else
            incorrect++;
        }
        System.out.println("- correct: " + correct);
        System.out.println("- incorrect: " + incorrect);
    
        // 2. using Weka's Evaluation class
        System.out.println("Weka's Evaluation");
        Evaluation eval = new Evaluation(train);
        eval.evaluateModel(model, train);
        System.out.println("- correct: " + eval.correct());
        System.out.println("- incorrect: " + eval.incorrect());
      }
    }
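
As for why the code in the question printed 1 through 13 rather than just 13: the println call sits inside the if block within the loop, so it runs once per match. Below is a minimal, Weka-free sketch of the counting pattern, where the counter is incremented inside the loop and printed once after it. The class name MatchCounter, the method countMatches, and the label arrays are made up for illustration and are not part of Weka or of the original code.

```java
public class MatchCounter {

  // Counts the positions at which the predicted label equals the actual label.
  public static int countMatches(String[] predicted, String[] actual) {
    if (predicted.length != actual.length) {
      throw new IllegalArgumentException("Arrays must have the same length");
    }
    int matches = 0;
    for (int i = 0; i < predicted.length; i++) {
      if (predicted[i].equals(actual[i])) {
        matches++; // only count here; do not print inside the loop
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    // Hypothetical labels, not taken from the Arsenal dataset.
    String[] predicted = {"Win", "Draw", "Win", "Loss"};
    String[] actual = {"Win", "Win", "Win", "Loss"};

    // Print the total once, after all comparisons are done.
    System.out.println(countMatches(predicted, actual)); // prints 3
  }
}
```

The same idea applies to the question's loop: keep z++ inside the if block and move System.out.println(z) to after the closing brace of the for loop.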
    

    BTW: You should never evaluate on the training data, as the results will be overly optimistic (the model has already seen all of this data!).
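
To illustrate the idea of keeping some data unseen, here is a minimal, Weka-free sketch of a shuffled hold-out split over instance indices. The class name HoldoutSplit, the method split, the 80/20 ratio, and the seed are assumptions chosen for the example. With Weka itself, the Evaluation class also offers cross-validation (crossValidateModel), which is usually the more convenient route.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class HoldoutSplit {

  // Returns two index lists: element 0 holds training indices, element 1 holds test indices.
  public static List<List<Integer>> split(int numInstances, double trainFraction, long seed) {
    if (numInstances < 0 || trainFraction < 0.0 || trainFraction > 1.0) {
      throw new IllegalArgumentException("Invalid instance count or train fraction");
    }
    List<Integer> indices = new ArrayList<>();
    for (int i = 0; i < numInstances; i++) {
      indices.add(i);
    }
    // Shuffle with a fixed seed so the split is reproducible.
    Collections.shuffle(indices, new Random(seed));

    int trainSize = (int) Math.round(numInstances * trainFraction);
    List<List<Integer>> result = new ArrayList<>();
    result.add(new ArrayList<>(indices.subList(0, trainSize)));
    result.add(new ArrayList<>(indices.subList(trainSize, numInstances)));
    return result;
  }

  public static void main(String[] args) {
    // 20 instances (the size of the dataset in the question), 80% for training.
    List<List<Integer>> parts = split(20, 0.8, 42L);
    System.out.println("train: " + parts.get(0).size() + ", test: " + parts.get(1).size());
  }
}
```

The model would then be built only on the instances at the training indices and scored only on those at the test indices. With just 20 instances a single hold-out set is very small, which is one reason cross-validation tends to be preferred for datasets of this size.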