How to add a neural network model with ML models in VotingRegressor?


Background of the Problem

I was trying to combine a KerasRegressor model with classical ML models (e.g. Lasso, Gradient Boosting Regressor) to build an ensemble. I used sklearn's VotingRegressor() to group the models. However, when I add the KerasRegressor model to VotingRegressor(), I get the following error:

ValueError: The estimator KerasRegressor should be a regressor.

How Did I Try to Solve the Problem?

I searched Google for the error message and found only this page, which does not contain a solution. I also read the KerasRegressor documentation, but I still do not understand the error, because the documentation says that KerasRegressor is an implementation of the scikit-learn regressor API for Keras.

Then, My Question

Why did I get the error and what can I do to solve it?

Any help will be greatly appreciated :). Thanks!


Solution

  • According to this issue, there is no fix on the Keras side: the scikit-learn wrapper in Keras is no longer maintained and will be removed.

    Fortunately, the scikeras package solves this issue.

    I advise you to read its docs or tutorials, but here is a simple example using subclassing:

    !pip install scikeras
    
    import numpy as np
    from scikeras.wrappers import KerasRegressor
    from tensorflow import keras
    from sklearn.datasets import make_regression
    from sklearn.ensemble import VotingRegressor
    from sklearn.linear_model import LinearRegression
    
    class MLPRegressor(KerasRegressor):
    
        def __init__(
            self,
            hidden_layer_sizes=(100, ),
            optimizer="adam",
            optimizer__learning_rate=0.001,
            epochs=10,
            verbose=0,
            **kwargs,
        ):
            super().__init__(**kwargs)
            self.hidden_layer_sizes = hidden_layer_sizes
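            self.optimizer = optimizer
            # store the routed parameter too, so that it is not silently dropped
            self.optimizer__learning_rate = optimizer__learning_rate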
            self.epochs = epochs
            self.verbose = verbose
    
        def _keras_build_fn(self, compile_kwargs):
            model = keras.Sequential()
            # shape must be a tuple, hence the trailing comma
            inp = keras.layers.Input(shape=(self.n_features_in_,))
            model.add(inp)
            for hidden_layer_size in self.hidden_layer_sizes:
                layer = keras.layers.Dense(hidden_layer_size, activation="relu")
                model.add(layer)
            out = keras.layers.Dense(1)
            model.add(out)
            model.compile(loss="mse", optimizer=compile_kwargs["optimizer"])
            return model
    
    # simple linear regression
    r1 = LinearRegression()
    # keras model wrapper
    r2 = MLPRegressor(epochs=20)
    
    # toy data: y must be defined before X is derived from it
    y = np.arange(100)
    X = (y / 2).reshape(-1, 1)
    
    # defining the voting regressor
    vr = VotingRegressor([('lr', r1), ('MLPReg', r2)])
    
    vr.fit(X,y)
    

    VotingRegressor(estimators=[('lr', LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)), ('MLPReg', MLPRegressor(batch_size=None, build_fn=None, callbacks=None, epochs=20, hidden_layer_sizes=(100,), loss=None, metrics=None, model=None, optimizer='adam', random_state=None, run_eagerly=False, shuffle=True, validation_batch_size=None, validation_split=0.0, verbose=0, warm_start=False))], n_jobs=None, weights=None)
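
As for why the error appears: as far as I can tell, VotingRegressor checks every estimator with sklearn.base.is_regressor before fitting, and the old Keras wrapper is not marked as a regressor, whereas the scikeras one is. The sketch below reproduces that check using only scikit-learn. PlainWrapper and MarkedWrapper are hypothetical stand-ins written for this illustration (they just predict the mean of y); they are not the actual Keras or scikeras classes.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin, is_regressor
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression


class PlainWrapper(BaseEstimator):
    """Hypothetical stand-in: has fit/predict but is not marked as a regressor."""

    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)


class MarkedWrapper(RegressorMixin, PlainWrapper):
    """Same model, but RegressorMixin marks it as a regressor."""


print(is_regressor(PlainWrapper()))   # False
print(is_regressor(MarkedWrapper()))  # True

# same toy data as in the example above
y = np.arange(100)
X = (y / 2).reshape(-1, 1)

# The unmarked wrapper is rejected before any fitting happens.
error_message = None
try:
    VotingRegressor([("lr", LinearRegression()), ("plain", PlainWrapper())]).fit(X, y)
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# The marked wrapper passes the check and the ensemble fits normally.
vr = VotingRegressor([("lr", LinearRegression()), ("marked", MarkedWrapper())]).fit(X, y)
print(vr.predict(X[:3]))
```

So any wrapper that passes is_regressor should be accepted by VotingRegressor, which is why switching to the scikeras KerasRegressor (or a subclass of it) makes the error go away.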