Java/WEKA: K-Means clustering error: Cannot handle any class attribute


    SimpleKMeans kmeans = new SimpleKMeans();
    int numberOfClusters = 2;
    int[] assignments = null;

    kmeans.setSeed(10);

    // This is the important parameter to set
    kmeans.setPreserveInstancesOrder(true);
    try {
        kmeans.setNumClusters(numberOfClusters);
        kmeans.buildClusterer(instancesOne); // <-- exception being thrown
        // This array returns the cluster number (starting with 0) for each instance
        // The array has as many elements as the number of instances
        assignments = kmeans.getAssignments();
    } catch (Exception e) {
        e.printStackTrace();
    }

I am trying to initialize the parameters of an EM algorithm with a k-means algorithm: I want two centroids from which I can further train the parameters of a GMM. However, I am receiving the following error:

weka.core.WekaException: weka.clusterers.SimpleKMeans: Cannot handle any class attribute!
    at weka.core.Capabilities.test(Unknown Source)
    at weka.core.Capabilities.test(Unknown Source)
    at weka.core.Capabilities.testWithFail(Unknown Source)
    at weka.clusterers.SimpleKMeans.buildClusterer(Unknown Source)
    at hmm.HMM.run(HMM.java:62)
    at hmm.HMM.main(HMM.java:22)
Exception in thread "main" java.lang.NullPointerException
    at hmm.HMM.run(HMM.java:71)
    at hmm.HMM.main(HMM.java:22)

Also, how do I set two random centroids to begin with? I think the setSeed() method does that, but how do I get it to work with my dataset? My CSV file looks like this:

[screenshot of the CSV file]

And I load it like this:

Instances instancesOne = loader.loadCsv("train", "class1");

Here is some information about the attributes when loaded:

dataset:

@relation class1

@attribute x numeric
@attribute y numeric

@data
-9.0278,3.1518
-9.5656,3.6383
-9.805,3.8284
etc...

Answer: this code was needed to make the Instances class-less (that is, to remove the class attribute):

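// Remove the class attribute so the data set is class-less.
// Note: this assumes a class index is actually set (classIndex() >= 0).
Instances dataClusterer = null;
weka.filters.unsupervised.attribute.Remove filter = new weka.filters.unsupervised.attribute.Remove();
// Remove expects 1-based attribute indices, while classIndex() is 0-based.
filter.setAttributeIndices("" + (instancesOne.classIndex() + 1));
try {
    // Set the input format after the options, then apply the filter.
    filter.setInputFormat(instancesOne);
    dataClusterer = weka.filters.Filter.useFilter(instancesOne, filter);
} catch (Exception e1) {
    e1.printStackTrace();
    return;
}
// Pass dataClusterer (not instancesOne) to kmeans.buildClusterer(...).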

Solution

  • K-means clustering does not use a class attribute, and the exception shows that SimpleKMeans rejects data that has one set. If you have set one for your instances, remove it and rerun the code. This guide may assist in methods for building a clustering model.

    Hope this helps!
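
On the side question about starting from two random centroids: a seed typically drives a pseudo-random generator that picks the starting points, so the same seed on the same data should give the same start. The following is a minimal plain-Java sketch of that idea, not WEKA's SimpleKMeans code; the class name `SeededCentroidSketch` and the method `pickInitialCentroids` are hypothetical, and the three data rows are the ones shown in the question.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; this is not how WEKA is implemented internally.
public class SeededCentroidSketch {

    // Picks k distinct rows of `data` as starting centroids, driven by `seed`.
    public static double[][] pickInitialCentroids(double[][] data, int k, long seed) {
        if (k < 1 || k > data.length) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        Random rng = new Random(seed);

        // Partial Fisher-Yates shuffle over row indices:
        // after step i, idx[0..i] holds i + 1 distinct random row indices.
        int[] idx = new int[data.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        double[][] centroids = new double[k][];
        for (int i = 0; i < k; i++) {
            int j = i + rng.nextInt(idx.length - i);
            int tmp = idx[i];
            idx[i] = idx[j];
            idx[j] = tmp;
            centroids[i] = data[idx[i]].clone(); // copy so callers cannot alter the data
        }
        return centroids;
    }

    public static void main(String[] args) {
        double[][] data = {
            {-9.0278, 3.1518},
            {-9.5656, 3.6383},
            {-9.805, 3.8284}
        };
        double[][] first = pickInitialCentroids(data, 2, 10L);
        double[][] second = pickInitialCentroids(data, 2, 10L);
        System.out.println(Arrays.deepToString(first));
        System.out.println("same seed, same start: " + Arrays.deepEquals(first, second));
    }
}
```

In WEKA itself you would not write this by hand: calling setSeed() before buildClusterer(), as the question's code already does, is the usual way to make the random start reproducible.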