How do I combine a few weak learners into a strong classifier? I know the formula, but every paper about AdaBoost that I've read gives only formulas without any example. I have the weak learners and their weights, so I can do what the formula tells me to do (multiply a learner by its weight, add the next one multiplied by its weight, and so on), but how exactly do I do that? My weak learners are decision stumps. Each has an attribute and a threshold, so what do I multiply?
If I understand your question correctly, these lecture notes give a great explanation, with a lot of images, of how boosting ensembles the weak classifiers into a strong classifier:
Basically, by taking the weighted combination of the separating hyperplanes you create a more complex decision surface (there are great plots showing this in the lecture notes).
Hope this helps.
To do it practically:
On page 42 you see the formula alpha_t = 1/2*ln((1-e_t)/e_t), where e_t is the weighted error of weak classifier t. It can easily be calculated in a for loop or, if you are using a numeric library (I'm using numpy, which is really great), directly with vector operations. The alpha_t values are calculated inside AdaBoost itself, so I assume you already have them.
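As a rough sketch of that formula with numpy (the error values below are made up for illustration and are not taken from the lecture notes):

```python
import numpy as np

# Hypothetical weighted error rates e_t of three weak classifiers (illustrative values).
errors = np.array([0.1, 0.3, 0.45])

# alpha_t = 1/2 * ln((1 - e_t) / e_t), computed for all t at once.
alpha = 0.5 * np.log((1.0 - errors) / errors)
print(alpha)
```

A weak classifier with a smaller error gets a larger alpha, and one with an error close to 0.5 gets an alpha close to 0.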
You have the mathematical formula on page 38; the big sigma stands for the sum over all weak classifiers.
h_t(x) is the weak classifier function and it returns either -1 (no) or 1 (yes).
alpha_t is basically how good the weak classifier is and thus how much it has to say in the final decision of the strong classifier (not very democratic).
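Since the weak learners in the question are decision stumps, one way to picture h_t is as a function that compares one attribute with its threshold and returns -1 or 1. A minimal sketch, where `make_stump`, `polarity` and the example values are hypothetical names and numbers chosen for illustration:

```python
def make_stump(attribute, threshold, polarity=1):
    """Return a weak classifier h(x) that outputs +1 or -1.

    attribute: index of the feature the stump looks at
    threshold: value the feature is compared against
    polarity:  +1 or -1, flips which side counts as "yes" (an assumed convention)
    """
    def h(x):
        return polarity if x[attribute] > threshold else -polarity
    return h

h0 = make_stump(attribute=0, threshold=2.5)
print(h0([3.0, 1.0]))  # feature 0 is above the threshold
print(h0([1.0, 1.0]))  # feature 0 is below the threshold
```

So the attribute and the threshold are never multiplied by anything; only the stump's -1/1 answer is multiplied by its alpha_t.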
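I almost never use for loops myself, but a loop is easier to understand and more language independent (this is Python-style code, where alpha is the list of weights and h the list of weak classifier functions):

    def strongclassifier(x, alpha, h):
        response = 0.0
        for t in range(len(h)):  # over all weak classifier indices
            response += alpha[t] * h[t](x)  # weighted vote of classifier t
        return 1 if response >= 0 else -1  # sign of the weighted sum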
Mathematically this is the dot product between the weights and the weak responses, followed by taking the sign (basically: strong(x) = sign(alpha · weak(x))).
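With numpy the same sum can be written as a dot product. This is only a sketch, assuming `alpha` is an array of weights and `weak` is a list of functions returning -1 or 1 (the three lambdas and their weights are made-up stand-ins):

```python
import numpy as np

# Made-up weights and weak classifiers, for illustration only.
alpha = np.array([0.9, 0.4, 0.2])
weak = [
    lambda x: 1 if x[0] > 2.5 else -1,
    lambda x: 1 if x[1] > 0.0 else -1,
    lambda x: 1 if x[0] > 5.0 else -1,
]

def strong(x):
    # Vector of weak responses, one entry (-1 or 1) per weak classifier.
    responses = np.array([h(x) for h in weak])
    # Weighted vote: dot product of weights and responses, then its sign.
    return 1 if np.dot(alpha, responses) >= 0 else -1

print(strong([3.0, -1.0]))
```

Here the first weak classifier is outvoted 2 to 1 in count, but its larger weight still decides the result.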
This is what is happening inside strongclassifier(x): the separating hyperplane is decided in the function weak(x), so all x's with weak(x)=1 are on one side of the hyperplane while those with weak(x)=-1 are on the other side. If you think of the hyperplanes as lines on a plane, each line always separates the plane into two parts: one side is (-) and the other is (+). If you now have 3 infinite lines in the shape of a triangle with their negative sides facing outwards, you get 3 (+)'s inside the triangle and 1 or 2 (-)'s outside, which results (in the strong classifier) in a triangular region that is positive while the rest is negative. It's an oversimplification, since the actual outcome in each region depends on the weights, but the point still stands and it works completely analogously in higher dimensions.
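To make the triangle picture concrete, here is a small sketch that only counts the votes at three sample points. The three lines and the points are arbitrary choices for illustration:

```python
# Three lines forming a triangle; each weak classifier says +1 on the inner side.
weak = [
    lambda p: 1 if p[0] >= 0 else -1,          # right of the line x = 0
    lambda p: 1 if p[1] >= 0 else -1,          # above the line y = 0
    lambda p: 1 if p[0] + p[1] <= 1 else -1,   # below the line x + y = 1
]

def votes(p):
    """Return the list of +1/-1 answers of the three weak classifiers at point p."""
    return [h(p) for h in weak]

print(votes((0.25, 0.25)))   # inside the triangle
print(votes((0.25, -0.5)))   # outside, across one edge
print(votes((-0.5, -0.5)))   # outside, beyond a corner
```

Inside the triangle all three answers are +1; outside, one or two of them flip to -1. Whether a given outside region ends up negative then depends on the weights alpha_t and on any further weak classifiers in the sum.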