python machine-learning scikit-learn neural-network optuna

How to set hidden_layer_sizes in sklearn MLPRegressor using optuna trial

I would like to use [OPTUNA][1] with sklearn [MLPRegressor][1] model.

For almost all hyperparameters it is quite straightforward how to set OPTUNA for them. For example, to set the learning rate: learning_rate_init = trial.suggest_float('learning_rate_init ',0.0001, 0.1001, step=0.005)

My problem is how to set it for hidden_layer_sizes since it is a tuple. So let's say I would like to have two hidden layers where the first will have 100 neurons and the second will have 50 neurons. Without OPTUNA I would do:

MLPRegressor( hidden_layer_sizes =(100,50))

But what if I want OPTUNA to try different neurons in each layer? e.g., from 100 to 500, how can I set it? the MLPRegressor expects a tuple

Solution

You could set up your objective function as follows:

import optuna
import warnings
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
warnings.filterwarnings('ignore')

X, y = make_regression(random_state=1)

X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=1)

def objective(trial):

    params = {
        'learning_rate_init': trial.suggest_float('learning_rate_init ', 0.0001, 0.1, step=0.005),
        'first_layer_neurons': trial.suggest_int('first_layer_neurons', 10, 100, step=10),
        'second_layer_neurons': trial.suggest_int('second_layer_neurons', 10, 100, step=10),
        'activation': trial.suggest_categorical('activation', ['identity', 'tanh', 'relu']),
    }

    model = MLPRegressor(
        hidden_layer_sizes=(params['first_layer_neurons'], params['second_layer_neurons']),
        learning_rate_init=params['learning_rate_init'],
        activation=params['activation'],
        random_state=1,
        max_iter=100
    )

    model.fit(X_train, y_train)

    return mean_squared_error(y_valid, model.predict(X_valid), squared=False)

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=3)
# [I 2021-11-11 18:04:02,216] A new study created in memory with name: no-name-14c92e38-b8cd-4b8d-8a95-77158d996f20
# [I 2021-11-11 18:04:02,283] Trial 0 finished with value: 161.8347337123744 and parameters: {'learning_rate_init ': 0.0651, 'first_layer_neurons': 20, 'second_layer_neurons': 40, 'activation': 'tanh'}. Best is trial 0 with value: 161.8347337123744.
# [I 2021-11-11 18:04:02,368] Trial 1 finished with value: 159.55535852658082 and parameters: {'learning_rate_init ': 0.0551, 'first_layer_neurons': 90, 'second_layer_neurons': 70, 'activation': 'relu'}. Best is trial 1 with value: 159.55535852658082.
# [I 2021-11-11 18:04:02,440] Trial 2 finished with value: 161.73980822730888 and parameters: {'learning_rate_init ': 0.0051, 'first_layer_neurons': 100, 'second_layer_neurons': 30, 'activation': 'identity'}. Best is trial 1 with value: 159.55535852658082.