
Randomly sample a subset of attributes in Weka (Java)


I am working with the Weka API and want to select a random subset of attributes from an Instances object. I know about the RandomSubset filter, which is supposed to pick a random subset of attributes, but it does not seem to work. In the code below, I tell the RandomSubset object to select 7 attributes at random and then apply it with Filter.useFilter to my Instances object, which originally has 24 attributes. I expect a new Instances object with just 7 randomly selected attributes, but that does not happen. Instead, every time I run the code I get the SAME 12 attributes, which tells me that RandomSubset is not random at all!

RandomSubset randomSubset = new RandomSubset();
randomSubset.setInputFormat(instances); // set input format
randomSubset.setNumAttributes(7); // number of attributes to pick at random
Instances sub = Filter.useFilter(instances, randomSubset); // apply the filter to the instances
System.out.println(sub); // contains 12 attributes instead of 7

How do I make this method work? Is this a bug?

Thank you, A desperate coder!


Solution

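  • The setInputFormat call has to happen after all options have been set. That call is where the filter determines its output format and internal data structures based on the options set at that moment; any option set afterwards is simply ignored. In your code, setNumAttributes(7) comes too late, so the filter still uses its default of 0.5 (50% of the attributes), which gives 12 of your 24. Think of setInputFormat as the Apply button in a user interface.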
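
To make the ordering concrete, here is a minimal sketch of the corrected call sequence. It assumes Weka 3.7 or later (for the ArrayList-based Instances constructor and DenseInstance). The class name RandomSubsetExample and the helpers buildDemoData and randomAttributeSubset are made up for illustration; the demo data is only a stand-in for your 24-attribute dataset. The sketch also sets a seed: the identical subset on every run is most likely down to the filter's fixed default seed, so pass a different seed when you want a different subset.

```java
import java.util.ArrayList;
import java.util.Random;

import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instances;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.RandomSubset;

public class RandomSubsetExample {

    /** Hypothetical helper: builds an all-numeric dataset with no class index set. */
    static Instances buildDemoData(int numAttributes, int numRows) {
        ArrayList<Attribute> attrs = new ArrayList<>();
        for (int i = 0; i < numAttributes; i++) {
            attrs.add(new Attribute("a" + i)); // numeric attribute
        }
        Instances data = new Instances("demo", attrs, numRows);
        for (int r = 0; r < numRows; r++) {
            double[] values = new double[numAttributes];
            for (int c = 0; c < numAttributes; c++) {
                values[c] = r * numAttributes + c;
            }
            data.add(new DenseInstance(1.0, values));
        }
        return data;
    }

    /** Hypothetical helper: picks `count` attributes at random using the given seed. */
    static Instances randomAttributeSubset(Instances data, int count, int seed)
            throws Exception {
        RandomSubset filter = new RandomSubset();
        // 1. Set every option first.
        filter.setNumAttributes(count); // values >= 1 are an absolute count, < 1 a fraction
        filter.setSeed(seed);           // a different seed gives a different subset
        // 2. Only then call setInputFormat, which fixes the output format.
        filter.setInputFormat(data);
        // 3. Apply the filter.
        return Filter.useFilter(data, filter);
    }

    public static void main(String[] args) throws Exception {
        Instances data = buildDemoData(24, 5);
        // Use a fresh seed per run so the selection changes between runs.
        Instances sub = randomAttributeSubset(data, 7, new Random().nextInt());
        System.out.println("Attributes kept: " + sub.numAttributes());
        for (int i = 0; i < sub.numAttributes(); i++) {
            System.out.println("  " + sub.attribute(i).name());
        }
    }
}
```

Reusing the same seed reproduces the same subset, which is handy for repeatable experiments. The demo data has no class index set; if yours does, check numAttributes() on the result, since the filter may treat the class attribute separately.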