machine-learning scikit-learn neural-network adaboost

Using scikit-learn's MLPClassifier in AdaBoostClassifier

For a binary classification problem, I want to use MLPClassifier as the base estimator in AdaBoostClassifier. However, this does not work because MLPClassifier's fit method does not accept sample_weight, which AdaBoostClassifier requires (see here). Before that, I tried wrapping a Keras model in KerasClassifier and passing it to AdaBoostClassifier, but that also did not work, as mentioned here.
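To make the requirement concrete, here is a minimal sketch of how one can check whether an estimator's fit method accepts sample_weight. It uses only the standard library; PlainEstimator and WeightedEstimator are hypothetical stand-ins, not scikit-learn classes. As far as I know, scikit-learn ships a similar helper, has_fit_parameter in sklearn.utils.validation, but treat that as an assumption and check your installed version.

```python
import inspect


def accepts_sample_weight(estimator):
    """Return True if estimator.fit has a parameter named sample_weight."""
    return "sample_weight" in inspect.signature(estimator.fit).parameters


# Hypothetical estimator whose fit() ignores sample weights entirely.
class PlainEstimator:
    def fit(self, X, y):
        return self


# Hypothetical estimator whose fit() exposes a sample_weight parameter.
class WeightedEstimator:
    def fit(self, X, y, sample_weight=None):
        return self


plain_ok = accepts_sample_weight(PlainEstimator())
weighted_ok = accepts_sample_weight(WeightedEstimator())
print(plain_ok, weighted_ok)
```

An estimator of the first kind is the one a boosting wrapper that passes sample_weight to fit cannot use directly.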

One approach, proposed by user V1nc3nt, is to build a custom MLP classifier in TensorFlow that takes sample_weight into account.

User V1nc3nt shared large parts of his code, but since I have only limited experience with TensorFlow, I am not able to fill in the missing parts. Hence, I was wondering whether anyone has found a working solution for building AdaBoost ensembles from MLPs, or can help me complete the solution proposed by V1nc3nt.

Thank you very much in advance!

Solution

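Based on the references you mentioned, I have subclassed MLPClassifier so that its fit method accepts sample_weight. The subclass does not use the weights in the loss; instead, it resamples the training set with replacement, drawing each sample with probability proportional to its weight.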

Try this!

from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier
import numpy as np

class customMLPClassifer(MLPClassifier):
    def resample_with_replacement(self, X_train, y_train, sample_weight):
        # Draw len(X_train) rows with replacement, each with probability
        # proportional to its weight, so heavily weighted samples appear
        # more often in the resampled training set.
        X_train = np.asarray(X_train)
        y_train = np.asarray(y_train)

        # normalize sample_weight so that it sums to 1
        sample_weight = np.asarray(sample_weight, dtype=np.float64)
        sample_weight = sample_weight / sample_weight.sum()

        # draw all row indices (0 to len(X_train)-1) in a single call
        n_samples = len(X_train)
        draws = np.random.choice(n_samples, size=n_samples, p=sample_weight)

        # indexing keeps the original dtypes of X and y
        return X_train[draws], y_train[draws]


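    def fit(self, X, y, sample_weight=None):
        # Emulate the weights by resampling the training set before fitting.
        if sample_weight is not None:
            X, y = self.resample_with_replacement(X, y, sample_weight)

        # Use the public fit() rather than the private _fit(), which is not
        # part of the public API and may change between releases.
        return super().fit(X, y)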


X, y = load_iris(return_X_y=True)
# scikit-learn >= 1.2 names this parameter `estimator`;
# older releases use `base_estimator` instead.
adabooster = AdaBoostClassifier(estimator=customMLPClassifer())

adabooster.fit(X, y)
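
The resampling in resample_with_replacement stands in for real sample weights: a sample with a larger weight is simply drawn more often. The sketch below illustrates that idea in isolation. It uses the standard library's random.choices in place of np.random.choice so that it runs without dependencies, and the weights and the helper resample_indices are made up for the illustration.

```python
import random
from collections import Counter


def resample_indices(weights, rng):
    """Draw len(weights) indices with replacement, proportional to weights."""
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)


# Hypothetical weights: sample 0 carries 70% of the total weight.
weights = [0.7, 0.1, 0.1, 0.1]
rng = random.Random(0)

# Repeat the resampling many times and count how often each index is drawn.
counts = Counter()
n_rounds = 5000
for _ in range(n_rounds):
    counts.update(resample_indices(weights, rng))

# The empirical draw frequencies should be close to the weights.
total = n_rounds * len(weights)
frequencies = [counts[i] / total for i in range(len(weights))]
print(frequencies)
```

Note that any single resampled training set is random, so each boosting round sees only a noisy approximation of the weighted data.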