I'm implementing an application that uses AdaBoost to classify whether an elephant is Asian or African. My input data is:
Elephant size: 235 Elephant weight: 3568 Sample weight: 0.1 Elephant type: Asian
Elephant size: 321 Elephant weight: 4789 Sample weight: 0.1 Elephant type: African
Elephant size: 389 Elephant weight: 5689 Sample weight: 0.1 Elephant type: African
Elephant size: 210 Elephant weight: 2700 Sample weight: 0.1 Elephant type: Asian
Elephant size: 270 Elephant weight: 3654 Sample weight: 0.1 Elephant type: Asian
Elephant size: 289 Elephant weight: 3832 Sample weight: 0.1 Elephant type: African
Elephant size: 368 Elephant weight: 5976 Sample weight: 0.1 Elephant type: African
Elephant size: 291 Elephant weight: 4872 Sample weight: 0.1 Elephant type: Asian
Elephant size: 303 Elephant weight: 5132 Sample weight: 0.1 Elephant type: African
Elephant size: 246 Elephant weight: 2221 Sample weight: 0.1 Elephant type: African
I created a Classifier class:
import java.util.ArrayList;

public class Classifier {
    private String feature;         // "size" or "weight"
    private int treshold;
    private double errorRate;
    private double classifierWeight;

    public Classifier(String feature, int treshold) {
        setFeature(feature);
        setTreshold(treshold);
    }

    // Labels the elephant African if the chosen feature exceeds the threshold, Asian otherwise.
    public void classify(Elephant elephant) {
        if (feature.equals("size")) {
            if (elephant.getSize() > treshold) {
                elephant.setClassifiedAs(ElephantType.African);
            } else {
                elephant.setClassifiedAs(ElephantType.Asian);
            }
        } else if (feature.equals("weight")) {
            if (elephant.getWeight() > treshold) {
                elephant.setClassifiedAs(ElephantType.African);
            } else {
                elephant.setClassifiedAs(ElephantType.Asian);
            }
        }
    }

    // Fraction of misclassified elephants. Note: this unweighted ratio equals the
    // weighted error only while all sample weights are equal (as in the first round).
    public void countErrorRate(ArrayList<Elephant> elephants) {
        double misclassified = 0;
        for (Elephant elephant : elephants) {
            if (!elephant.getClassifiedAs().equals(elephant.getType())) {
                misclassified++;
            }
        }
        this.setErrorRate(misclassified / elephants.size());
    }

    // alpha = 0.5 * ln((1 - error) / error); undefined when errorRate is 0.
    public void countClassifierWeight() {
        this.setClassifierWeight(0.5 * Math.log((1.0 - errorRate) / errorRate));
    }

    // Getters and setters omitted.
}
In main(), I trained a classifier that classifies by "size" with a threshold of 250, like this:
main.trainAWeakClassifier("size", 250);
After my classifier classifies each elephant, I compute the classifier's error, update the weight of each sample (elephant), and compute the weight of the classifier. My questions are:
How do I create the next classifier, and how does it pay more attention to the misclassified samples? (I know that the sample weight is the key, but I don't know how to implement it.) Did I create the first classifier properly?
Well, you compute the error rate and can classify the instances, but what you are missing is updating the classifiers from one round to the next and combining them into one, per the AdaBoost formula. Take a look at the algorithm on Wikipedia's AdaBoost page.
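To make the missing steps concrete, here is a minimal sketch of the usual discrete AdaBoost loop on the ten elephants from the question. It is not the asker's code: the names AdaBoostSketch, Stump, bestStump, updateWeights and train are hypothetical, and the sketch assumes labels encoded as +1 (African) and -1 (Asian), candidate thresholds taken from the observed feature values, and a polarity flag so a stump may also predict Asian above the threshold. The next classifier "cares" about earlier mistakes only because its error is computed as the sum of sample weights of the elephants it gets wrong, and those weights grow after each misclassification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AdaBoostSketch {
    // Features per elephant: {size, weight}. Labels (assumed encoding): +1 = African, -1 = Asian.
    static final double[][] X = {
        {235, 3568}, {321, 4789}, {389, 5689}, {210, 2700}, {270, 3654},
        {289, 3832}, {368, 5976}, {291, 4872}, {303, 5132}, {246, 2221}
    };
    static final int[] Y = {-1, +1, +1, -1, -1, +1, +1, -1, +1, +1};

    /** Decision stump: predicts polarity if x[feature] > threshold, else -polarity. */
    static class Stump {
        int feature;
        double threshold;
        int polarity;
        double error;  // weighted error on the training set
        double alpha;  // classifier weight

        int predict(double[] x) {
            return x[feature] > threshold ? polarity : -polarity;
        }
    }

    /** Weighted error: total weight of the samples the stump gets wrong. */
    static double weightedError(Stump s, double[][] x, int[] y, double[] w) {
        double err = 0.0;
        for (int i = 0; i < x.length; i++) {
            if (s.predict(x[i]) != y[i]) err += w[i];
        }
        return err;
    }

    /** Tries every feature, observed value and polarity; keeps the lowest weighted error. */
    static Stump bestStump(double[][] x, int[] y, double[] w) {
        Stump best = null;
        for (int f = 0; f < x[0].length; f++) {
            for (double[] row : x) {
                for (int p = -1; p <= 1; p += 2) {
                    Stump s = new Stump();
                    s.feature = f;
                    s.threshold = row[f];
                    s.polarity = p;
                    s.error = weightedError(s, x, y, w);
                    if (best == null || s.error < best.error) best = s;
                }
            }
        }
        // Clamp the error so a perfect stump does not cause a division by zero.
        double e = Math.min(Math.max(best.error, 1e-10), 1.0 - 1e-10);
        best.alpha = 0.5 * Math.log((1.0 - e) / e);
        return best;
    }

    /** Raises the weight of misclassified samples, lowers the rest, then renormalizes. */
    static void updateWeights(double[] w, double[][] x, int[] y, Stump s) {
        double sum = 0.0;
        for (int i = 0; i < w.length; i++) {
            w[i] *= Math.exp(-s.alpha * y[i] * s.predict(x[i]));
            sum += w[i];
        }
        for (int i = 0; i < w.length; i++) w[i] /= sum;
    }

    /** Runs the boosting rounds, starting from uniform sample weights. */
    static List<Stump> train(double[][] x, int[] y, int rounds) {
        double[] w = new double[x.length];
        Arrays.fill(w, 1.0 / x.length);
        List<Stump> ensemble = new ArrayList<>();
        for (int t = 0; t < rounds; t++) {
            Stump s = bestStump(x, y, w);
            ensemble.add(s);
            updateWeights(w, x, y, s);
        }
        return ensemble;
    }

    /** Final classifier: sign of the alpha-weighted vote of all stumps. */
    static int predict(List<Stump> ensemble, double[] x) {
        double vote = 0.0;
        for (Stump s : ensemble) vote += s.alpha * s.predict(x);
        return vote >= 0 ? +1 : -1;
    }

    public static void main(String[] args) {
        List<Stump> ensemble = train(X, Y, 5);
        for (Stump s : ensemble) {
            System.out.printf("feature=%d threshold=%.0f polarity=%+d error=%.3f alpha=%.3f%n",
                    s.feature, s.threshold, s.polarity, s.error, s.alpha);
        }
        int correct = 0;
        for (int i = 0; i < X.length; i++) {
            if (predict(ensemble, X[i]) == Y[i]) correct++;
        }
        System.out.println(correct + " of " + X.length + " training samples classified correctly");
    }
}
```

With the uniform starting weights of 0.1, the best stump on this data misclassifies two elephants, so its weighted error is 0.2 and its classifier weight is 0.5 * ln(0.8 / 0.2) = ln 2, roughly 0.693. After the update, each of the two misclassified elephants carries weight 0.25 and each of the other eight carries 0.0625, so the stump that just won now has a weighted error of exactly 0.5 and the next round is pushed toward a different split. Searching for the threshold instead of fixing it at 250 is a design choice of this sketch; a fixed list of candidate classifiers would work the same way as long as the one with the lowest weighted error is picked each round.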