Tags: java, machine-learning, classification, weka, prediction

Using a saved model for prediction with Weka (Eclipse + Java)


I am confused about what should go on the right-hand side of the lines starting with "Instances originalTrain=". Can anyone please help me correct this error? I am new to Weka. We are building a disease prediction system using Weka in Java.

import weka.classifiers.Classifier;
import weka.core.Instances;

public class Main {

    public static void main(String[] args) throws Exception
    {
        String rootPath="/some/where/"; 
        Instances originalTrain= //instances here (don't know how to complete this statement)

        //load model
        Classifier cls = (Classifier) weka.core.SerializationHelper.read(rootPath+"tree.model");

        //predict instance class values
        Instances originalTrain= //load or create Instances to predict (This statement too)

        //which instance to predict class value
        int s1=0;

        //perform your prediction
        double value=cls.classifyInstance(originalTrain.instance(s1));

        //get the prediction percentage or distribution
        double[] percentage=cls.distributionForInstance(originalTrain.instance(s1));

        //get the name of the class value
        String prediction=originalTrain.classAttribute().value((int)value); 

        System.out.println("The predicted value of instance "+
                            Integer.toString(s1)+
                            ": "+prediction); 

        //Format the distribution
        String distribution="";
        for(int i=0; i <percentage.length; i=i+1)
        {
            if(i==value)
            {
                distribution=distribution+"*"+Double.toString(percentage[i])+",";
            }
            else
            {
                distribution=distribution+Double.toString(percentage[i])+",";
            }
        }
        distribution=distribution.substring(0, distribution.length()-1);

        System.out.println("Distribution:"+ distribution);
    }

}

Solution

  • For completeness, the code snippet in the question originates from Get prediction percentage in WEKA using own Java code and a model.

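    originalTrain should be your training instances, or more generally any instances that have the same attribute structure as the data the model was trained on, with the class attribute set. There are two ways that I know of to add instances to originalTrain.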

    1. This method loads data from an .arff file and is based on instructions found here.

      import weka.core.Instances;
      import weka.core.converters.ConverterUtils.DataSource;

      // rootPath should be the directory that holds the .arff file
      // filename should be the complete name of the .arff file
      public static Instances instanceData(String rootPath, String filename) throws Exception
      {
        // load the dataset from the .arff file
        DataSource source = new DataSource(rootPath + filename);
        Instances data = source.getDataSet();

        // set the class to the last attribute of the data (may need to tweak)
        if (data.classIndex() == -1)
          data.setClassIndex(data.numAttributes() - 1);

        return data;
      }

    2. You can create and add instances manually, as described in the answer to "Define input data for clustering using WEKA API".
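
For method 1, the instances to predict have to exist as an .arff file first. As a rough sketch of how such a file could be produced, the class below builds ARFF text using only the Java standard library (no Weka calls). The class name ArffSketch, the method buildArff, and the attribute names age, bmi and disease with their labels are all hypothetical; the real attributes must match, in name, type and order, the data the model was trained on. This sketch also assumes that every attribute except the class is numeric.

```java
import java.util.Arrays;
import java.util.List;

public class ArffSketch {

    // Builds the text of a small ARFF file with numeric attributes and one
    // nominal class attribute placed last. All names passed in are
    // illustrative placeholders, not taken from any real dataset.
    public static String buildArff(String relation,
                                   List<String> numericAttributes,
                                   String classAttribute,
                                   List<String> classLabels,
                                   List<double[]> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        for (String name : numericAttributes) {
            sb.append("@attribute ").append(name).append(" numeric\n");
        }
        sb.append("@attribute ").append(classAttribute)
          .append(" {").append(String.join(",", classLabels)).append("}\n\n");
        sb.append("@data\n");
        for (double[] row : rows) {
            for (double v : row) {
                sb.append(v).append(",");
            }
            // "?" marks the class value as missing, since it is what we want predicted
            sb.append("?\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String arff = buildArff(
                "patients",
                Arrays.asList("age", "bmi"),
                "disease",
                Arrays.asList("healthy", "sick"),
                Arrays.asList(new double[] {52, 31.4}, new double[] {37, 22.0}));
        System.out.print(arff);
    }
}
```

Once the text is saved to a file under rootPath (for example with java.nio.file.Files.write), it could be loaded with instanceData(rootPath, filename) from method 1 and passed to classifyInstance as in the question.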