I have a problem setting the destination class of an Instance object. Imagine this situation: I have two regression results, each containing a slope and an intercept. I set the first four attributes to those doubles, and the last attribute, which is the destination attribute, is set by index rather than by value.
Here's how it looks in code:
    for (RegressionArffRow row : input) {
        Instance record = new SparseInstance(attrInfo.size());
        int attrIdx = 0;
        // two regression results -> four numeric attributes (slope, intercept each)
        for (RegressionResult regResult : row.getRegressionResults()) {
            record.setValue(attrIdx++, regResult.getSlope());
            record.setValue(attrIdx++, regResult.getIntercept());
        }
        // last attribute: destination class, passed as a class index
        record.setValue(attrIdx, row.getDestinationClass());
        instances.add(record);
    }
The returned destination class is in fact a class index. I have two classes, "tree" and "flower", created by the snippet below:
    FastVector destValues = new FastVector();
    destValues.addElement("tree");
    destValues.addElement("flower");
    Attribute destClassAttribute = new Attribute("destClass", destValues);
And here comes the problem: when I set the record's destination class to 1, my Instance is set to "flower". But when I set it to 0, the last attribute is not set at all.
In short, it looks like this:
    record.setValue(attrIdx, 0);
gives this result in the debugger:
    {0 0.07017,1 -1.338295,2 -0.252162,3 1.377695}
and this:
    record.setValue(attrIdx, 1);
gives the following:
    {0 0.07017,1 -1.338295,2 -0.252162,3 1.377695, 4 "flower"}
OK, the problem is that I use SparseInstance here, which drops values equal to 0. I intuitively thought this applied only to numeric attributes, so that only their values would be erased, with no effect on nominal ones. But I missed this fragment in the documentation:
this also includes nominal attributes -- the first nominal value (i.e. that which has index 0) will not require explicit storage, so rearrange your nominal attribute value orderings if necessary
In my example, the "tree" value was not inserted because its index equals 0.
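To make the quoted behaviour concrete, here is a minimal sketch in plain Java of how sparse storage treats zeros. It is a toy model under stated assumptions, not Weka's implementation: the class SparseRowSketch and its methods setValue, value and storedCount are made up for illustration. In this model an index with no stored entry reads back as 0, i.e. as the first nominal value; as far as I know SparseInstance behaves the same way when you read the value, but check that against your Weka version.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of sparse storage (hypothetical; NOT Weka's SparseInstance). */
final class SparseRowSketch {
    // Only non-zero values are kept, keyed by attribute index.
    private final Map<Integer, Double> stored = new TreeMap<>();

    void setValue(int attrIdx, double value) {
        if (value == 0.0) {
            stored.remove(attrIdx);      // zero is implicit, so nothing is stored
        } else {
            stored.put(attrIdx, value);  // any non-zero value gets an explicit entry
        }
    }

    // An index with no stored entry reads back as 0.
    double value(int attrIdx) {
        return stored.getOrDefault(attrIdx, 0.0);
    }

    int storedCount() {
        return stored.size();
    }

    public static void main(String[] args) {
        String[] destValues = {"tree", "flower"};  // nominal value at index 0 is "tree"
        SparseRowSketch record = new SparseRowSketch();
        record.setValue(0, 0.07017);
        record.setValue(4, 0);                     // class index 0: no entry is created
        System.out.println(record.storedCount());                // prints 1
        System.out.println(destValues[(int) record.value(4)]);   // prints tree
    }
}
```

The point of the sketch is that a nominal index is just a number to the storage layer, so index 0 is dropped exactly like a numeric 0.0 would be.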
If you want every nominal value to be stored explicitly, you have to work around this (assuming you still want to use SparseInstance).
You may want to implement it this way:
    FastVector destValues = new FastVector();
    destValues.addElement("dummy");   // placeholder at index 0, never used as a real class
    destValues.addElement("tree");
    destValues.addElement("flower");
    Attribute destClassAttribute = new Attribute("destClass", destValues);
and then you never use the "dummy" destination class.
As you can see, this "zero-cutting" is performed on the destination class attribute as well.