Tags: python, scikit-learn, skopt

Optimize the hidden_layer_sizes hyperparameter of MLPClassifier with skopt


How can I optimize the number of hidden layers and the size of each hidden layer in a neural network using MLPClassifier from sklearn and skopt?

Usually I'd specify my space something like:

Space([Integer(name='alpha_1', low=1, high=2),
       Real(10**-5, 10**0, "log-uniform", name='alpha_2')])

(let's say the hyperparameters alpha_1 and alpha_2).

With the neural network implementation in sklearn, I need to tune hidden_layer_sizes, which is a tuple:

 hidden_layer_sizes : tuple, length = n_layers - 2, default=(100,)
     The ith element represents the number of neurons in the ith
     hidden layer.
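
For example, if I read the docs correctly, hidden_layer_sizes=(100, 50) would give a network with two hidden layers of 100 and 50 neurons:

from sklearn.neural_network import MLPClassifier

# two hidden layers: 100 neurons in the first, 50 in the second
clf = MLPClassifier(hidden_layer_sizes=(100, 50))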

How can I represent this in Space?


Solution

  • If you are using gp_minimize, you can include the number of hidden layers and the number of neurons per layer as parameters in Space. Inside the objective function you can then build the hidden_layer_sizes tuple from them manually.

    This is an example adapted from the scikit-optimize homepage, now using an MLPRegressor (and the California housing data, since load_boston has been removed from recent scikit-learn releases):

    import numpy as np
    from sklearn.datasets import fetch_california_housing
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import cross_val_score
    from skopt.space import Integer, Categorical
    from skopt.utils import use_named_args
    from skopt import gp_minimize

    # load_boston was removed in scikit-learn 1.2; the California
    # housing data serves as a drop-in regression dataset
    housing = fetch_california_housing()
    X, y = housing.data, housing.target
    n_features = X.shape[1]
    
    reg = MLPRegressor(random_state=0)
    
    space = [
        Categorical(['tanh', 'relu'], name='activation'),
        Integer(1, 4, name='n_hidden_layer'),
        Integer(200, 2000, name='n_neurons_per_layer')]
    
    @use_named_args(space)
    def objective(**params):
        n_neurons = params['n_neurons_per_layer']
        n_layers = params['n_hidden_layer']
    
        # build hidden_layer_sizes as a tuple of length n_layers,
        # with n_neurons neurons in each hidden layer
        params['hidden_layer_sizes'] = (n_neurons,) * n_layers
    
        # drop the auxiliary parameters, since MLPRegressor does not accept them
        params.pop('n_neurons_per_layer')
        params.pop('n_hidden_layer')
    
        reg.set_params(**params)
    
        # cross_val_score returns the negated MAE, and gp_minimize
        # minimizes, so flip the sign back to a positive error
        return -np.mean(cross_val_score(reg, X, y, cv=5, n_jobs=-1,
                                        scoring="neg_mean_absolute_error"))
    
    res_gp = gp_minimize(objective, space, n_calls=50, random_state=0)
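
    After the run, the best configuration can be read back from the result object and turned into the final tuple in the same way (a minimal sketch; x and fun are standard fields of skopt's OptimizeResult, in the order of the space dimensions):

    best_activation, best_n_layers, best_n_neurons = res_gp.x

    print("best cross-validated MAE: %.4f" % res_gp.fun)
    print("activation:", best_activation)
    print("hidden_layer_sizes:", (best_n_neurons,) * best_n_layers)

    Note that this search forces every hidden layer to have the same width. If each layer should get its own size, you would need one Integer dimension per potential layer and build the tuple from those values inside the objective.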