Adaboost with neural networks


I implemented AdaBoost for a project, but I'm not sure I've understood it correctly. Here's what I implemented; please let me know whether it is a correct interpretation.

  1. My weak classifiers are 8 different neural networks. Each of these predicts with around 70% accuracy after full training.
  2. I train all these networks fully and collect their predictions on the training set, so I have 8 vectors of predictions on the training set.

Now I use adaboost. My interpretation of adaboost is that it will find a final classifier as a weighted average of the classifiers I have trained above, and its role is to find these weights. So, for every training example I have 8 predictions, and I'm combining them using adaboost weights. Note that with this interpretation, the weak classifiers are not retrained during the adaboost iterations, only the weights are updated. But the updated weights in effect create new classifiers in each iteration.

Here's the pseudo code:

all_alphas = [] 
all_classifier_indices = []
initialize all training example weights to 1/(num of examples)
compute error for all 8 networks on the training set
for i in 1 to T:
      find the classifier with lowest weighted error.
      compute the weights (alpha) according to the Adaboost confidence formula
      Update the weight distribution, according to the weight update formula in Adaboost.
      all_alphas.append(alpha) 
      all_classifier_indices.append(selected_classifier)
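
The pseudocode refers to the confidence and weight update formulas without stating them. For reference, here is the standard discrete AdaBoost form, assuming binary labels y_i in {-1, +1} and classifier outputs h_t(x_i) in {-1, +1} (the question does not say which label encoding is used, so that is an assumption):

```latex
\begin{align*}
\epsilon_t &= \sum_{i=1}^{N} w_i^{(t)} \, \mathbf{1}\!\left[h_t(x_i) \neq y_i\right] \\
\alpha_t &= \frac{1}{2} \ln\frac{1 - \epsilon_t}{\epsilon_t} \\
w_i^{(t+1)} &= \frac{w_i^{(t)} \exp\!\left(-\alpha_t \, y_i \, h_t(x_i)\right)}{Z_t},
\qquad Z_t = \sum_{j=1}^{N} w_j^{(t)} \exp\!\left(-\alpha_t \, y_j \, h_t(x_j)\right)
\end{align*}
```

Here w_i^{(1)} = 1/N, and Z_t only renormalizes the weights so that they sum to 1.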

After T iterations, there are T alphas and T classifier indices; each index points to one of the 8 neural net prediction vectors.

Then on the test set, for every example, I predict by summing alpha*classifier over the T selected classifiers.
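
The procedure described above could be sketched in NumPy roughly as follows. This is only an illustration under stated assumptions: labels and predictions are encoded as -1/+1, the names `adaboost_select` and `ensemble_predict` are made up for this sketch, and the weighted error of every network is recomputed in each round because the example weights change.

```python
import numpy as np

def adaboost_select(preds, y, T):
    """Pick one fixed classifier per round from a pre-trained pool.

    preds: (n_classifiers, n_examples) array of -1/+1 predictions (assumed encoding)
    y:     (n_examples,) array of -1/+1 labels
    """
    n = y.shape[0]
    w = np.full(n, 1.0 / n)                    # uniform example weights
    miss = (preds != y).astype(float)          # 1.0 where a classifier is wrong
    alphas, indices = [], []
    for _ in range(T):
        errs = miss @ w                        # weighted error of each classifier
        k = int(np.argmin(errs))               # lowest weighted error this round
        eps = np.clip(errs[k], 1e-10, 1 - 1e-10)
        if eps >= 0.5:                         # no classifier beats chance: stop
            break
        alpha = 0.5 * np.log((1 - eps) / eps)  # confidence of the chosen classifier
        w = w * np.exp(-alpha * y * preds[k])  # up-weight mistakes, down-weight hits
        w /= w.sum()
        alphas.append(alpha)
        indices.append(k)
    return np.array(alphas), np.array(indices)

def ensemble_predict(preds, alphas, indices):
    """Sign of the alpha-weighted sum of the selected classifiers' predictions."""
    score = alphas @ preds[indices]
    return np.where(score >= 0, 1, -1)
```

On the test set, `preds` would hold the same 8 networks' predictions for the test examples, and `ensemble_predict` reuses the alphas and indices found on the training set.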

I want to use AdaBoost with neural networks, but I think I've misinterpreted the algorithm.


Solution

  • Boosting summary:

    1- Train your first weak classifier on the training data.

    2- The 1st trained classifier makes mistakes on some samples and correctly classifies others. Increase the weight of the wrongly classified samples and decrease the weight of the correct ones. Retrain your classifier with these weights to get your 2nd classifier.

    In your case, this means first resampling the training data with replacement, drawing each example with probability proportional to its updated weight, and then training your classifier on the resampled data.

    3- Repeat the 2nd step T times and, at the end of each round, calculate the alpha weight for the classifier according to the formula.

    4- The final classifier is the weighted sum of the decisions of the T classifiers.
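
The steps above can be sketched as a resampling-based boosting loop. This is a minimal illustration, not a definitive implementation: labels are assumed to be -1/+1, `boost_by_resampling` and `predict_boosted` are names invented here, and `fit_stump` is a hypothetical stand-in for "train a neural network on this data" that keeps the example dependency-light. Any function `fit(X, y)` that returns a predictor could be passed instead.

```python
import numpy as np

def fit_stump(X, y):
    """Stand-in weak learner (assumption): best single-feature threshold rule."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] > thr, sign, -sign)
                err = np.mean(pred != y)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    _, j, thr, sign = best
    return lambda Z: np.where(Z[:, j] > thr, sign, -sign)

def boost_by_resampling(X, y, fit, T, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    w = np.full(n, 1.0 / n)                      # step 1: uniform weights
    models, alphas = [], []
    for _ in range(T):
        # Draw a new training set with replacement, proportional to the weights.
        idx = rng.choice(n, size=n, replace=True, p=w)
        h = fit(X[idx], y[idx])                  # retrain on the resampled data
        pred = h(X)                              # evaluate on the original data
        eps = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
        if eps >= 0.5:                           # weak learner no better than chance
            break
        alpha = 0.5 * np.log((1 - eps) / eps)    # step 3: alpha for this round
        w *= np.exp(-alpha * y * pred)           # step 2: reweight the examples
        w /= w.sum()
        models.append(h)
        alphas.append(alpha)
    return models, alphas

def predict_boosted(models, alphas, X):
    """Step 4: sign of the alpha-weighted sum of the T classifiers' decisions."""
    score = np.zeros(X.shape[0])
    for a, h in zip(alphas, models):
        score += a * h(X)
    return np.where(score >= 0, 1, -1)
```

The difference from selecting among fixed, pre-trained networks is the `fit(X[idx], y[idx])` call inside the loop: every round produces a new classifier trained on data that emphasizes the previous rounds' mistakes.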

    It is hopefully clear from this explanation that you have done it somewhat differently. Instead of retraining your networks on the reweighted data in each round, you trained them all on the original dataset. In effect you are using a random-forest-style ensemble (except that you are using NNs instead of decision trees).

    PS: There is no guarantee that boosting increases accuracy. In fact, none of the boosting methods I'm aware of has managed to improve accuracy with NNs as weak learners (the reason lies in the way boosting works and needs a lengthier discussion).