I'd like to use the Stanford Classifier for text classification. My features are mostly textual, but there are some numeric features as well (e.g. the length of a sentence).

I started off with the ClassifierExample and replaced the original features with a single real-valued feature F, which has the value 100 if a stop light is BROKEN and 0.1 otherwise. This results in the following code (apart from the makeStopLights() function, it is just the code of the original ClassifierExample class):
import java.util.ArrayList;
import java.util.List;

import edu.stanford.nlp.classify.LinearClassifier;
import edu.stanford.nlp.classify.LinearClassifierFactory;
import edu.stanford.nlp.ling.Datum;
import edu.stanford.nlp.ling.RVFDatum;
import edu.stanford.nlp.stats.ClassicCounter;
import edu.stanford.nlp.stats.Counter;

public class ClassifierExample {

    protected static final String GREEN = "green";
    protected static final String RED = "red";
    protected static final String WORKING = "working";
    protected static final String BROKEN = "broken";

    private ClassifierExample() {} // not instantiable

    // the definition of this function was changed!!
    protected static Datum<String,String> makeStopLights(String ns, String ew) {
        String label = (ns.equals(ew) ? BROKEN : WORKING);
        Counter<String> counter = new ClassicCounter<>();
        counter.setCount("F", (label.equals(BROKEN)) ? 100 : 0.1);
        return new RVFDatum<>(counter, label);
    }

    public static void main(String[] args) {
        // Create a training set
        List<Datum<String,String>> trainingData = new ArrayList<>();
        trainingData.add(makeStopLights(GREEN, RED));
        trainingData.add(makeStopLights(GREEN, RED));
        trainingData.add(makeStopLights(GREEN, RED));
        trainingData.add(makeStopLights(RED, GREEN));
        trainingData.add(makeStopLights(RED, GREEN));
        trainingData.add(makeStopLights(RED, GREEN));
        trainingData.add(makeStopLights(RED, RED));

        // Create a test set
        Datum<String,String> workingLights = makeStopLights(GREEN, RED);
        Datum<String,String> brokenLights = makeStopLights(RED, RED);

        // Build a classifier factory
        LinearClassifierFactory<String,String> factory = new LinearClassifierFactory<>();
        factory.useConjugateGradientAscent();
        // Turn on per-iteration convergence updates
        factory.setVerbose(true);
        // Small amount of smoothing
        factory.setSigma(10.0);

        // Build a classifier
        LinearClassifier<String,String> classifier = factory.trainClassifier(trainingData);

        // Check out the learned weights
        classifier.dump();

        // Test the classifier
        System.out.println("Working instance got: " + classifier.classOf(workingLights));
        classifier.justificationOf(workingLights);
        System.out.println("Broken instance got: " + classifier.classOf(brokenLights));
        classifier.justificationOf(brokenLights);
    }
}
In my understanding of linear classifiers, feature F should make the classification task pretty easy: after all, we just need to check whether the value of F is greater than some threshold. However, the classifier returns WORKING for every instance in the test set.

Now my question is: Have I done something wrong? Do I need to change some other part of the code for real-valued features to work, or is there something wrong with my understanding of linear classifiers?
Your code looks fine. Note that typically with a Maximum Entropy classifier you provide binary-valued features (1 or 0).

Here is some more reading on Maximum Entropy classifiers: http://web.stanford.edu/class/cs124/lec/Maximum_Entropy_Classifiers

Look at the slide titled "Feature-Based Linear Classifiers" to see the specific probability calculation for Maximum Entropy classifiers.
Here is the formula for your example case, with 1 feature and 2 classes ("working", "broken"):

probability(c1) = exp(w1 * f1) / total
probability(c2) = exp(w2 * f1) / total
total = exp(w1 * f1) + exp(w2 * f1)

Here w1 is the learned weight for "working" and w2 is the learned weight for "broken". The classifier selects the class with the higher probability. Note that f1 is your feature value (either 100 or 0.1).
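To make the formula concrete, here is a minimal sketch of that calculation in plain Java. The weights w1 and w2 (and the class name MaxEntFormulaDemo) are made up purely for illustration; they are not the weights your classifier actually learned:

public class MaxEntFormulaDemo {
    public static void main(String[] args) {
        // Hypothetical weights, for illustration only; all that matters is w1 > w2
        double w1 = 0.7;  // weight for "working"
        double w2 = 0.3;  // weight for "broken"
        for (double f1 : new double[]{100.0, 0.1}) {
            // Two-class maxent probabilities for this feature value
            double total = Math.exp(w1 * f1) + Math.exp(w2 * f1);
            double pWorking = Math.exp(w1 * f1) / total;
            double pBroken = Math.exp(w2 * f1) / total;
            System.out.printf("f1 = %5.1f -> p(working) = %.4f, p(broken) = %.4f%n",
                    f1, pWorking, pBroken);
        }
    }
}

With these weights, "working" gets the higher probability for both f1 = 100 and f1 = 0.1, which is exactly the behavior you observed.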
If you consider your specific example data, then since you have 2 classes, 1 feature, and a feature value that is always positive, it is not possible to build a maximum entropy classifier that separates that data: it will always guess all one way or the other.

For the sake of argument, say w1 > w2, and let v > 0 be your feature value (either 100 or 0.1). Then w1 * v > w2 * v, and thus exp(w1 * v) > exp(w2 * v), so you'll always assign more probability to class 1 regardless of the value of v.
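Another way to see this: a threshold test like "F > t" needs an intercept, but with a single always-positive feature the decision boundary is forced through the origin. One possible workaround (my own suggestion, not part of the original example) is to add a constant bias feature. Here is a sketch as a drop-in replacement for your makeStopLights(), using the same imports as your class; the feature name "BIAS" is hypothetical:

protected static Datum<String,String> makeStopLights(String ns, String ew) {
    String label = (ns.equals(ew) ? BROKEN : WORKING);
    Counter<String> counter = new ClassicCounter<>();
    counter.setCount("F", label.equals(BROKEN) ? 100 : 0.1);
    // Constant bias feature (hypothetical name): gives each class an
    // intercept, so the learned decision rule can become "F > threshold"
    // instead of being forced through the origin.
    counter.setCount("BIAS", 1.0);
    return new RVFDatum<>(counter, label);
}

With the bias present, the class scores become w1 * v + b1 versus w2 * v + b2, and the sign of (w1 - w2) * v + (b1 - b2) can flip as v changes, which is the threshold behavior you expected.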